The present Application is a Continuation-In-Part of copending U.S. Serial No.
08/126,588 and copending U.S. Serial No. 08/126,595, both filed September 24,
1994, which are both Continuations-In-Part of copending U.S. Serial No.
07/980,498, filed November 23, 1992, which is a Continuation-In-Part of
copending U.S. Serial No. 07/854,296, filed March 19, 1992, the disclosures of
which are hereby incorporated by reference in their entireties. Applicants claim
the benefits of these Applications under 35 U.S.C. § 120.

The Applicants are authors or co-authors of several articles directed to the subject
matter of the present invention: (1) Darnell et al., "Interferon-Dependent
Transcriptional Activation: Signal Transduction Without Second Messenger
Involvement?" THE NEW BIOLOGIST, 2(1):1-4 (1990); (2) X. Fu et al.,
"ISGF3, The Transcriptional Activator Induced by Interferon α, Consists of
Multiple Interacting Polypeptide Chains" PROC. NATL. ACAD. SCI. USA,
87:8555-8559 (1990); (3) D.S. Kessler et al., "IFNα Regulates Nuclear
Translocation and DNA-Binding Affinity of ISGF3, A Multimeric Transcriptional
Activator" GENES AND DEVELOPMENT, 4:1753 (1990). All of the above
listed articles are incorporated herein by reference.



TECHNICAL FIELD OF THE INVENTION



The present invention relates generally to intracellular receptor recognition
proteins or factors (i.e., groups of proteins), and to methods and compositions
including such factors or the antibodies reactive toward them, or analogs thereof, in
assays and for diagnosing, preventing and/or treating cellular debilitation,
derangement or dysfunction. More particularly, the present invention relates to
particular IFN-dependent receptor recognition molecules that have been identified
and sequenced, and that demonstrate direct participation in intracellular events,
extending from interaction with the liganded receptor at the cell surface to
transcription in the nucleus, and to antibodies or to other entities specific thereto
that may thereby selectively modulate such activity in mammalian cells.

BACKGROUND OF THE INVENTION

There are several possible pathways of signal transduction that might be followed
after a polypeptide ligand binds to its cognate cell surface receptor. Within
minutes of such ligand-receptor interaction, genes that were previously quiescent
are rapidly transcribed (Murdoch et al., 1982; Larner et al., 1984; Friedman et
al., 1984; Greenberg and Ziff, 1984; Greenberg et al., 1985). One of the most
physiologically important, yet poorly understood, aspects of these immediate
transcriptional responses is their specificity: the set of genes activated, for
example, by platelet-derived growth factor (PDGF) does not completely overlap
with the one activated by nerve growth factor (NGF) or tumor necrosis factor
(TNF) (Cochran et al., 1983; Greenberg et al., 1985; Almendral et al., 1988; Lee
et al., 1990). The interferons (IFN) activate sets of other genes entirely. Even
IFNα and IFNγ, whose presence results in the slowing of cell growth and in an
increased resistance to viruses (Tamm et al., 1987), do not activate exactly the
same set of genes (Larner et al., 1984; Friedman et al., 1984; Celis et al., 1987,
1985; Larner et al., 1986).

The current hypotheses related to signal transduction pathways in the cytoplasm do
not adequately explain the high degree of specificity observed in
polypeptide-dependent transcriptional responses. The most commonly discussed pathways of
signal transduction that might ultimately lead to the nucleus depend on properties
of cell surface receptors containing tyrosine kinase domains [for example, PDGF,
epidermal growth factor (EGF), colony-stimulating factor (CSF), insulin-like
growth factor-1 (IGF-1); see Gill, 1990; Hunter, 1990] or of receptors that interact
with G-proteins (Gilman, 1987). These two groups of receptors mediate changes
in the intracellular concentrations of second messengers that, in turn, activate one
of a series of protein phosphokinases, resulting in a cascade of phosphorylations
(or dephosphorylations) of cytoplasmic proteins.

It has been widely conjectured that the cascade of phosphorylations secondary to 
changes in intracellular second messenger levels is responsible for variations in the 
rates of transcription of particular genes (Bourne, 1988, 1990; Berridge, 1987; 
Gill, 1990; Hunter, 1990). However, there are at least two reasons to question
the suggestion that global changes in second messengers participate in the chain of
events leading to specific transcriptional responses dependent on specific receptor 
occupation by polypeptide ligands. 



First, there is a limited number of second messengers (cAMP, diacylglycerol,
phosphoinositides, and Ca²⁺ are the most prominently discussed), whereas the
number of known cell surface receptor-ligand pairs of only the tyrosine kinase and
G-protein varieties, for example, already greatly outnumbers the list of second
messengers, and could easily stretch into the hundreds (Gill, 1990; Hunter, 1990).
In addition, since many different receptors can coexist on one cell type at any
instant, a cell can be called upon to respond simultaneously to two or more
different ligands with an individually specific transcriptional response, each
involving a different set of target genes. Second, a number of receptors for
polypeptide ligands are now known that have neither tyrosine kinase domains nor
any structure suggesting interaction with G-proteins. These include the receptors
for interleukin-2 (IL-2) (Leonard et al., 1985), IFNα (Uze et al., 1990), IFNγ
(Aguet et al., 1988), NGF (Johnson et al., 1986), and growth hormone (Leung et
al., 1987). The binding of each of these receptors to its specific ligand has been
demonstrated to stimulate transcription of a specific set of genes. For these
reasons it seems unlikely that global intracellular fluctuations in a limited set of
second messengers are integral to the pathway of specific, polypeptide
ligand-dependent, immediate transcriptional responses.




In PCT International Publication No. WO 92/08740, published 29 May 1992 by
the applicant herein, the above analysis was presented, and it was discovered and
proposed that a receptor recognition factor or factors served in some capacity as a
type of direct messenger between liganded receptors at the cell surface and the cell
nucleus. One of the characteristics that was ascribed to the receptor recognition
factor was its apparent lack of requirement for changes in second messenger
concentrations. Continued investigation of the receptor recognition factor through
study of the actions of the interferons IFNα and IFNγ has further elucidated the
characteristics and structure of the interferon-related factor ISGF-3, and more
broadly, the characterization and structure of the receptor recognition factor in a
manner that extends beyond earlier discoveries previously described. It is
accordingly to the presentation of this updated characterization of the receptor
recognition factor and the materials and methods both diagnostic and therapeutic
corresponding thereto that the present disclosure is directed.

SUMMARY OF THE INVENTION



In accordance with the present invention, receptor recognition factors have been
further characterized that appear to interact directly with receptors that have been
occupied by their ligand on cellular surfaces, and which in turn either become
active transcription factors, or activate or directly associate with transcription
factors that enter the cell's nucleus and specifically bind to predetermined sites
and thereby activate the genes. It should be noted that the receptor recognition
proteins thus possess multiple properties, among them: 1) recognizing and being
activated during such recognition by receptors; 2) being translocated to the nucleus
by an inhibitable process (e.g., NaF inhibits translocation); and 3) combining with
transcription activating proteins or acting themselves as transcription activation
proteins, and that all of these properties are possessed by the proteins described
herein.

A further property of the receptor recognition factors (also termed herein signal
transducers and activators of transcription, or STAT) is dimerization to form
homodimers or heterodimers upon activation by phosphorylation of tyrosine. In a
specific embodiment, infra, Stat91 and Stat84 form homodimers and a
Stat91-Stat84 heterodimer. Accordingly, the present invention is directed to such dimers,
which can form spontaneously by phosphorylation of the STAT protein, or which
can be prepared synthetically by chemically cross-linking two like or unlike STAT
proteins.

The receptor recognition factor is proteinaceous in composition and is believed to
be present in the cytoplasm. The recognition factor is not demonstrably affected
by concentrations of second messengers, but does exhibit direct interaction
with tyrosine kinase domains, although it exhibits no apparent interaction with
G-proteins. More particularly, as is shown in a co-pending, co-owned application
entitled "INTERFERON-ASSOCIATED RECEPTOR RECOGNITION
FACTORS, NUCLEIC ACIDS ENCODING THE SAME AND METHODS OF
USE THEREOF," filed on even date herewith, the 91 kD human interferon
(IFN)-γ factor, represented by SEQ ID NO:4, directly interacts with DNA after
acquiring phosphate on tyrosine located at position 701 of the amino acid
sequence.

The recognition factor is now known to comprise several proteinaceous
substituents, in the instance of IFNα and IFNγ. Particularly, three proteins
derived from the factor ISGF-3 have been successfully sequenced and their
sequences are set forth in FIGURE 1 (SEQ ID NOS:1, 2), FIGURE 2 (SEQ ID
NOS:3, 4) and FIGURE 3 (SEQ ID NOS:5, 6) herein. Additionally, a murine
gene encoding the 91 kD protein (SEQ ID NO:4) has been identified and
sequenced. The nucleotide sequence (SEQ ID NO:7) and deduced amino acid
sequence (SEQ ID NO:8) are shown in FIGURE 13A-13C.

In a further embodiment, murine genes encoding homologs of the recognition
factor have been successfully sequenced and cloned into plasmids. A gene in
plasmid 13sf1 has the nucleotide sequence (SEQ ID NO:9) and deduced amino
acid sequence (SEQ ID NO:10) shown in FIGURE 14A-14C. A gene in
plasmid 19sf6 has the nucleotide sequence (SEQ ID NO:11) and deduced amino
acid sequence (SEQ ID NO:12) shown in FIGURE 15A-15C.



It is particularly noteworthy that the protein sequence of FIGURE 1 (SEQ ID
NO:2) and the sequence of the proteins of FIGURES 2 (SEQ ID NO:4) and 3
(SEQ ID NO:6) derive, respectively, from two different but related genes.
Moreover, the protein sequence of FIGURE 13 (SEQ ID NO:8) derives from a
murine gene that is analogous to the gene encoding the protein of FIGURE 2 (SEQ
ID NO:4). Of further note is that the protein sequences of FIGURES 14 (SEQ ID
NO:10) and 15 (SEQ ID NO:12) derive from two genes that are different from,
but related to, the protein of FIGURE 13 (SEQ ID NO:8). It is clear from these
discoveries that a family of genes exists, and that further family members likewise
exist. Accordingly, as demonstrated herein, by use of hybridization techniques,
additional such family members will be found.

Further, the capacity of such family members to function in the manner of the
receptor recognition factors disclosed herein may be assessed by determining
those ligands that cause the phosphorylation of the particular family members.

In its broadest aspect, the present invention extends to a receptor recognition
factor implicated in the transcriptional stimulation of genes in target cells in
response to the binding of a specific polypeptide ligand to its cellular receptor on
said target cell, said receptor recognition factor having the following
characteristics:

a) apparent direct interaction with the ligand-bound receptor complex
and activation of one or more transcription factors capable of binding with a
specific gene;



b) an activity demonstrably unaffected by the presence or concentration 
of second messengers; 

c) direct interaction with tyrosine kinase domains; and 

d) a perceived absence of interaction with G-proteins. 


In a further aspect, the receptor recognition (STAT) protein forms a dimer upon 
activation by phosphorylation. 

In a specific example, the receptor recognition factor represented by SEQ ID
NO:4 possesses the added capability of acting as a [illegible], in
particular, as a DNA binding protein in response to interferon-γ stimulation. This
discovery presages an expanded role for the proteins in question, and other
proteins and like factors that have heretofore been characterized as receptor
recognition factors. It is therefore apparent that a single factor may indeed
provide the nexus between the liganded receptor at the cell surface and direct
participation in DNA transcriptional activity in the nucleus. This pleiotypic factor
has the following characteristics:

a) It interacts with an interferon-γ-bound receptor kinase complex;

b) It is a tyrosine kinase substrate; and

c) When phosphorylated, it serves as a DNA binding protein.

More particularly, the factor represented by SEQ ID NO:4 is interferon-dependent
in its activity and is responsive to interferon stimulation, particularly that of
interferon-γ. It has further been discovered that activation of the factor
represented by SEQ ID NO:4 requires phosphorylation of tyrosine-701 of the
protein, and further still that tyrosine phosphorylation requires the presence of a
functionally active SH2 domain in the protein. Preferably, such SH2 domain
contains an amino acid residue corresponding to an arginine at position 602 of the
protein.

In a still further aspect, the present invention extends to a receptor recognition
factor interactive with a liganded interferon receptor, which receptor recognition
factor possesses the following characteristics:

a) it is present in cytoplasm;

b) it undergoes tyrosine phosphorylation upon treatment of cells with IFNα
or IFNγ;

c) it activates transcription of an interferon stimulated gene;

d) it stimulates either an ISRE-dependent or a gamma activated site
(GAS)-dependent transcription in vivo;

e) it interacts with IFN cellular receptors; and

f) it undergoes nuclear translocation upon stimulation of the IFN cellular
receptors with IFN.



The factor of the invention represented by SEQ ID NO:4 appears to act in similar
fashion to an earlier determined site-specific DNA binding protein that is
interferon-γ dependent and that has been earlier called the γ activating factor
(GAF). Specifically, interferon-γ-dependent activation of this factor occurs
without new protein synthesis and appears within minutes of interferon-γ
treatment, achieves maximum extent between 15 and 30 minutes thereafter, and 
then disappears after 2-3 hours. These further characteristics of identification and 
action assist in the evaluation of the present factor for applications having both 
diagnostic and therapeutic significance. 

In a particular embodiment, the present invention relates to all members of the
herein disclosed family of receptor recognition factors except the 91 kD protein
factors, specifically the proteins whose sequences are represented by one or more
of SEQ ID NO:4, SEQ ID NO:6 or SEQ ID NO:8.

The present invention also relates to a recombinant DNA molecule or cloned gene,
or a degenerate variant thereof, which encodes a receptor recognition factor, or a
fragment thereof, that possesses a molecular weight of about 113 kD and an amino
acid sequence set forth in FIGURE 1 (SEQ ID NO:2); preferably a nucleic acid
molecule, in particular a recombinant DNA molecule or cloned gene, encoding the
113 kD receptor recognition factor has a nucleotide sequence or is complementary
to a DNA sequence shown in FIGURE 1 (SEQ ID NO:1). In a further
embodiment, the receptor recognition factor has a molecular weight of about 91
kD and the amino acid sequence set forth in FIGURE 2 (SEQ ID NO:4) or
FIGURE 13 (SEQ ID NO:8); preferably a nucleic acid molecule, in particular a
recombinant DNA molecule or cloned gene, encoding the 91 kD receptor
recognition factor has a nucleotide sequence or is complementary to a DNA
sequence shown in FIGURE 2 (SEQ ID NO:3) or FIGURE 13 (SEQ ID NO:7). In
yet a further embodiment, the receptor recognition factor has a molecular weight
of about 84 kD and the amino acid sequence set forth in FIGURE 3 (SEQ ID
NO:6); preferably a nucleic acid molecule, in particular a recombinant DNA
molecule or cloned gene, encoding the 84 kD receptor recognition factor has a
nucleotide sequence or is complementary to a DNA sequence shown in FIGURE 3
(SEQ ID NO:5). In yet another embodiment, the receptor recognition factor has
an amino acid sequence set forth in FIGURE 14 (SEQ ID NO:10); preferably a
nucleic acid molecule, in particular a recombinant DNA molecule or cloned gene,
encoding such receptor recognition factor has a nucleotide sequence or is
complementary to a DNA sequence shown in FIGURE 14 (SEQ ID NO:9). In still
another embodiment, the receptor recognition factor has an amino acid sequence
set forth in FIGURE 15 (SEQ ID NO:12); preferably a nucleic acid molecule, in
particular a recombinant DNA molecule or cloned gene, encoding such receptor
recognition factor has a nucleotide sequence or is complementary to a DNA
sequence shown in FIGURE 15 (SEQ ID NO:11).

The human and murine DNA sequences of the receptor recognition factors of the
present invention, or portions thereof, may be prepared as probes to screen for
complementary sequences and genomic clones in the same or alternate species.
The present invention extends to probes so prepared that may be provided for
screening cDNA and genomic libraries for the receptor recognition factors. For
example, the probes may be prepared with a variety of known vectors, such as the
phage λ vector. The present invention also includes the preparation of plasmids
including such vectors, and the use of the DNA sequences to construct vectors
expressing antisense RNA or ribozymes which would attack the mRNAs of any or
all of the DNA sequences set forth in FIGURES 1, 2, 3, 13, 14 and 15 (SEQ ID
NOS:1, 3, 5, 7, 9, and 11, respectively). Correspondingly, the preparation of
antisense RNA and ribozymes is included herein.
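
By way of illustration only, the base-pairing principle on which an antisense RNA relies can be sketched in a few lines of Python. The function name `antisense_rna` and the six-base example sequence are hypothetical and are not drawn from any of the SEQ ID NOS recited herein. The sketch assumes a simple Watson-Crick complement and makes no attempt to model vector construction or ribozyme design.

```python
# Illustrative sketch only: the function name and the example sequence are
# hypothetical and are not taken from the sequence listing.

# Watson-Crick pairing for RNA bases.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_rna(mrna: str) -> str:
    """Return the antisense RNA (5'->3') complementary to an mRNA (5'->3')."""
    mrna = mrna.upper()
    unknown = set(mrna) - set(RNA_COMPLEMENT)
    if unknown:
        raise ValueError(f"unexpected RNA bases: {sorted(unknown)}")
    # Complement each base, then reverse so the result also reads 5'->3'.
    return "".join(RNA_COMPLEMENT[base] for base in reversed(mrna))


if __name__ == "__main__":
    print(antisense_rna("AUGGCU"))  # prints AGCCAU
```

Applying the function twice returns the original sequence, which is a convenient sanity check on any such complement routine.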

The present invention also includes receptor recognition factor proteins having the
activities noted herein, and that display the amino acid sequences set forth and
described above and selected from SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6,
SEQ ID NO:8, SEQ ID NO:10 and SEQ ID NO:12.

In a further embodiment of the invention, the full DNA sequence of the
recombinant DNA molecule or cloned gene so determined may be operatively
linked to an expression control sequence which may be introduced into an
appropriate host. The invention accordingly extends to unicellular hosts
transformed with the cloned gene or recombinant DNA molecule comprising a
DNA sequence encoding the present receptor recognition factor(s), and more
particularly, the complete DNA sequence determined from the sequences set forth
above and in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ
ID NO:9 and SEQ ID NO:11.

According to other preferred features of certain preferred embodiments of the
present invention, a recombinant expression system is provided to produce
biologically active animal or human receptor recognition factor.

The concept of the receptor recognition factor contemplates that specific factors
exist for correspondingly specific ligands, such as tumor necrosis factor, nerve
growth factor and the like, as described earlier. Accordingly, the exact structure
of each receptor recognition factor will understandably vary so as to achieve this
ligand and activity specificity. It is this specificity and the direct involvement of
the receptor recognition factor in the chain of events leading to gene activation
that offers the promise of a broad spectrum of diagnostic and therapeutic utilities.

The present invention naturally contemplates several means for preparation of the
recognition factor, including, as illustrated herein, known recombinant techniques,
and the invention is accordingly intended to cover such synthetic preparations
within its scope. The isolation of the cDNA and amino acid sequences disclosed
herein facilitates the reproduction of the recognition factor by such recombinant
techniques, and accordingly, the invention extends to expression vectors prepared
from the disclosed DNA sequences for expression in host systems by recombinant
DNA techniques, and to the resulting transformed hosts.

The invention includes an assay system for screening of potential drugs effective to
modulate transcriptional activity of target mammalian cells by interrupting or
potentiating the recognition factor or factors. In one instance, the test drug could
be administered to a cellular sample with the ligand that activates the receptor
recognition factor, or an extract containing the activated recognition factor, to
determine its effect upon the binding activity of the recognition factor to any
chemical sample (including DNA), or to the test drug, by comparison with a
control.



The assay system could more importantly be adapted to identify drugs or other
entities that are capable of binding to the receptor recognition and/or transcription
factors or proteins, either in the cytoplasm or in the nucleus, thereby inhibiting or
potentiating transcriptional activity. Such assay would be useful in the
development of drugs that would be specific against particular cellular activity, or
that would potentiate such activity, in time or in level of activity. For example,
such drugs might be used to modulate cellular response to shock, or to treat other
pathologies, as for example, in making IFN more potent against cancer.

In yet a further embodiment, the invention contemplates antagonists of the activity
of a receptor recognition factor (STAT). In particular, an agent or molecule that
inhibits dimerization (homodimerization or heterodimerization) can be used to
block transcription activation effected by an activated, phosphorylated STAT
protein. In a specific embodiment, the antagonist can be a peptide having the
sequence of a portion of an SH2 domain of a STAT protein, or the phosphotyrosine
domain of a STAT protein, or both. If the peptide contains both regions,
preferably the regions are located in tandem, more preferably with the SH2
domain portion N-terminal to the phosphotyrosine portion. In a specific example,
infra, such peptides are shown to be capable of disrupting dimerization of STAT
proteins.



One of the characteristics of the present receptor recognition factors is their
participation in rapid phosphorylation and dephosphorylation during the course of
and as part of their activity. Significantly, such phosphorylation takes place in an
interferon-dependent manner and within a few minutes in the case of the ISGF-3
proteins identified herein, on the tyrosine residues defined thereon. This is strong
evidence that the receptor recognition factors disclosed herein are the first true
substrates whose intracellular function is well understood and whose intracellular
activity depends on tyrosine kinase phosphorylation. In particular, the addition of
phosphate to the tyrosine of a transcription factor is novel. This suggests further
that tyrosine kinase takes direct action in the transmission of intracellular signals to
the nucleus, and does not merely serve as a promoter or mediator of serine and/or
threonine kinase activity, as has been theorized to date. Also, the role of the factor
represented by SEQ ID NO:2 in its activated phosphorylated form suggests
possible independent therapeutic use for this activated form. Likewise, the role of
the factor as a tyrosine kinase substrate suggests its interaction with kinase in other
theatres apart from the complex observed herein.

The diagnostic utility of the present invention extends to the use of the present
receptor recognition factors in assays to screen for tyrosine kinase inhibitors.
Because the activity of the receptor recognition-transcriptional activation proteins
described herein must maintain tyrosine phosphorylation, they can be, and presumably
are, dephosphorylated by specific tyrosine phosphatases. Blocking of the specific
phosphatase is therefore an avenue of pharmacological intervention that would
potentiate the activity of the receptor recognition proteins.

The present invention likewise extends to the development of antibodies against the
receptor recognition factor(s), including naturally raised and recombinantly
prepared antibodies. For example, the antibodies could be used to screen
expression libraries to obtain the gene or genes that encode the receptor
recognition factor(s). Such antibodies could include both polyclonal and
monoclonal antibodies prepared by known genetic techniques, as well as
bi-specific (chimeric) antibodies, and antibodies including other functionalities suiting
them for additional diagnostic use conjunctive with their capability of modulating
transcriptional activity.

In particular, antibodies against specifically phosphorylated factors can be selected
and are included within the scope of the present invention for their particular
ability in following activated protein. Thus, activity of the recognition factors or
of the specific polypeptides believed to be causally connected thereto may
be followed directly by the assay techniques discussed later on, through
the use of an appropriately labeled quantity of the recognition factor or antibodies
or analogs thereof.

Thus, the receptor recognition factors, their analogs, and any
antagonists or antibodies that may be raised thereto, are capable of use in
connection with various diagnostic techniques, including immunoassays, such as a
radioimmunoassay, using for example, an antibody to the receptor recognition
factor that has been labeled by radioactive addition, reduction with sodium
borohydride, or radioiodination.

In an immunoassay, a control quantity of the antagonists or antibodies thereto, or
the like may be prepared and labeled with an enzyme, a specific binding partner
and/or a radioactive element, and may then be introduced into a cellular sample.
After the labeled material or its binding partner(s) has had an opportunity to react
with sites within the sample, the resulting mass may be examined by known
techniques, which may vary with the nature of the label attached. For example,
antibodies against specifically phosphorylated factors may be selected and
appropriately employed in the exemplary assay protocol, for the purpose of
following activated protein as described above.

In the instance where a radioactive label, such as the isotopes ³H, ¹⁴C, ³⁵S,
³⁶Cl, ⁵¹Cr, ⁵⁷Co, ⁵⁸Co, ⁵⁹Fe, ¹²⁵I, ¹³¹I, and ¹⁸⁶Re, is used, known currently
available counting procedures may be utilized. In the instance where the label is
an enzyme, detection may be accomplished by any of the presently utilized
colorimetric, spectrophotometric, fluorospectrophotometric, amperometric or
gasometric techniques known in the art.

The present invention includes an assay system which may be prepared in the form
of a test kit for the quantitative analysis of the extent of the presence of the
recognition factors, or to identify drugs or other agents that may mimic or block
their activity. The system or test kit may comprise a labeled component prepared
by one of the radioactive and/or enzymatic techniques discussed herein, coupling a
label to the recognition factors, their agonists and/or antagonists, and one or more
additional immunochemical reagents, at least one of which is a free or
immobilized ligand, capable either of binding with the labeled component, its
binding partner, one of the components to be determined or their binding
partner(s).

In a further embodiment, the present invention relates to certain therapeutic
methods which would be based upon the activity of the recognition factor(s), its
(or their) subunits, or active fragments thereof, or upon agents or other drugs
determined to possess the same activity. A first therapeutic method is associated
with the prevention of the manifestations of conditions causally related to or
following from the binding activity of the recognition factor or its subunits, and
comprises administering an agent capable of modulating the production and/or
activity of the recognition factor or subunits thereof, either individually or in
mixture with each other, in an amount effective to prevent the development of
those conditions in the host. For example, drugs or other binding partners to the
receptor recognition/transcription factors or proteins may be administered to
inhibit or potentiate transcriptional activity, as in the potentiation of interferon in
cancer therapy. Also, the blockade of the action of specific tyrosine phosphatases
in the dephosphorylation of activated (phosphorylated) recognition/transcription
factors or proteins presents a method for potentiating the activity of the receptor
recognition factor or protein that would concomitantly potentiate therapies based
on receptor recognition factor/protein activation.

More specifically, the therapeutic method generally referred to herein could
include the method for the treatment of various pathologies or other cellular
dysfunctions and derangements by the administration of pharmaceutical
compositions that may comprise effective inhibitors or enhancers of activation of
the recognition factor or its subunits, or other equally effective drugs developed
for instance by a drug screening assay prepared and used in accordance with a
further aspect of the present invention. For example, drugs or other binding
partners to the receptor recognition/transcription factor or proteins, as represented
by SEQ ID NO:2, may be administered to inhibit or potentiate transcriptional
activity, as in the potentiation of interferon in cancer therapy. Also, the blockade
of the action of specific tyrosine phosphatases in the dephosphorylation of
activated (phosphorylated) recognition/transcription factor or protein presents a
method for potentiating the activity of the receptor recognition factor or protein
that would concomitantly potentiate therapies based on receptor recognition
factor/protein activation. Correspondingly, the inhibition or blockade of the
activation or binding of the recognition/transcription factor would affect MHC
Class II expression and consequently would promote immunosuppression.
Materials exhibiting this activity, as illustrated later on herein by staurosporine,
may be useful in instances such as the treatment of autoimmune diseases and graft
rejection, where a degree of immunosuppression is desirable.


In particular, the proteins of ISGF-3 whose sequences are presented in SEQ ID
NOS:2, 4, 6, 8, 10 or 12 herein, their antibodies, agonists, antagonists, or active
fragments thereof, could be prepared in pharmaceutical formulations for
administration in instances wherein interferon therapy is appropriate, such as to
treat chronic viral hepatitis, hairy cell leukemia, and for use of interferon in
adjuvant therapy. The specificity of the receptor proteins hereof would make it
possible to better manage the aftereffects of current interferon therapy, and would
thereby make it possible to apply interferon as a general antiviral agent.



Accordingly, it is a principal object of the present invention to provide a receptor
recognition factor and its subunits in purified form that exhibits certain
characteristics and activities associated with transcriptional promotion of cellular
activity.



It is a further object of the present invention to provide antibodies to the receptor
recognition factor and its subunits, and methods for their preparation, including 
recombinant means. 

It is a further object of the present invention to provide a method for detecting the 
presence of the receptor recognition factor and its subunits in mammals in which
invasive, spontaneous, or idiopathic pathological states are suspected to be present. 

It is a further object of the present invention to provide a method and associated 
assay system for screening substances such as drugs, agents and the like, 
potentially effective in either mimicking the activity or combating the adverse
effects of the recognition factor and/or its subunits in mammals.

It is a still further object of the present invention to provide a method for the
treatment of mammals to control the amount or activity of the recognition factor or 
subunits thereof, so as to alter the adverse consequences of such presence or 
activity, or where beneficial, to enhance such activity. 

It is a still further object of the present invention to provide a method for the 
treatment of mammals to control the amount or activity of the recognition factor or 
its subunits, so as to treat or avert the adverse consequences of invasive, 
spontaneous or idiopathic pathological states. 

It is a still further object of the present invention to provide pharmaceutical 
compositions for use in therapeutic methods which comprise or are based upon the 
recognition factor, its subunits, their binding partner(s), or upon agents or drugs 
that control the production, or that mimic or antagonize the activities of the 
recognition factors. 

Other objects and advantages will become apparent to those skilled in the art from 
a review of the ensuing description which proceeds with reference to the following 
illustrative drawings. 



BRIEF DESCRIPTION OF THE DRAWINGS

FIGURE 1 depicts the full receptor recognition factor nucleic acid sequence and
the deduced amino acid sequence derived for the ISGF-3α gene defining the 113
kD protein. The nucleotides are numbered from 1 to 2553 (SEQ ID NO:1), and
the amino acids are numbered from 1 to 851 (SEQ ID NO:2).

FIGURE 2 depicts the full receptor recognition factor nucleic acid sequence and
the deduced amino acid sequence derived for the ISGF-3α gene defining the 91 kD
protein. The nucleotides are numbered from 1 to 3943 (SEQ ID NO:3), and the
amino acids are numbered from 1 to 750 (SEQ ID NO:4).

FIGURE 3 depicts the full receptor recognition factor nucleic acid sequence and
the deduced amino acid sequence derived for the ISGF-3α gene defining the 84 kD
protein. The nucleotides are numbered from 1 to 2166 (SEQ ID NO:5), and the
amino acids are numbered from 1 to 712 (SEQ ID NO:6).


FIGURE 4 shows the purification of ISGF-3. The left-hand portion of the Figure
shows the purification of ISGF-3, demonstrating the polypeptides present after the
first oligonucleotide affinity column (Lane 3) and two different preparations after
the final chromatography step (Lanes 1 and 2). The leftmost lane contains
protein size markers (High molecular weight, Sigma). ISGF-3 component proteins
are indicated as 113 kD, 91 kD, 84 kD, and 48 kD [Kessler et al., GENES &
DEV., 4 (1990); Levy et al., THE EMBO J., 9 (1990)]. The right-hand portion
of the Figure shows purified ISGF-3 from 2-3 x 10^^ cells that was electroblotted to
nitrocellulose after preparations 1 and 2 (Lanes 1 and 2) had been pooled and
separated on a 7.5% SDS polyacrylamide gel. ISGF-3 component proteins are
indicated. The two lanes on the right represent protein markers (High molecular
weight, and prestained markers, Sigma).

FIGURE 5 generally presents the results of Northern Blot analysis for the 91/84
kD peptides. Figure 5a presents restriction maps for cDNA clones E4 (top map)
and E3 (bottom map) showing DNA fragments that were radiolabeled as probes
(probes A-D). Figure 5b comprises Northern blots of cytoplasmic HeLa RNA
hybridized with the indicated probes. The 4.4 and 3.1 KB species as well as the
28S and 18S rRNA bands are indicated.

FIGURE 6 depicts the conjoint protein sequence of the 91 kD (SEQ ID NO:4) and
84 kD (SEQ ID NO:6) proteins of ISGF-3. One letter amino acid code is shown
for the open reading frame from clone E4 (encoding the 91 kD protein). The 84
kD protein, encoded by a different cDNA (E3), has the identical sequence but
terminates after amino acid 712, as indicated. Tryptic peptides t19, t13a, and t13b
from the 91 kD protein are indicated. The sole recovered tryptic peptide from the
84 kD protein, peptide t27, was wholly contained within peptide t19 as indicated.



FIGURE 7 presents the results of Western blot and antibody shift analyses.

a) Highly purified ISGF-3, fractionated on a 7.0% SDS polyacrylamide
gel, was probed with antibodies a42 (amino acids 597-703); a55 (amino acids
2-59); and a57 (amino acids 705-739) in a Western blot analysis. The silver
stained part of the gel (lanes a, b, and c) illustrates the location of the ISGF-3
component proteins and the purity of the material used in Western blot: Lane a)
Silver stain of protein sample used in all the Western blot experiments (immune
and preimmune). Lane b) Material of equal purity to that shown in Fig. 4, for
clearer identification of the ISGF-3 proteins. Lane c) Size protein markers
indicated.

b) Antibody interference of the ISGF-3 shift complex: Lane a) The
complete ISGF-3 and the free ISGF-3γ component shift with partially purified
ISGF-3 are marked. Lane b) Competition with a 100-fold excess of cold ISRE
oligonucleotide. Lane c) Shift complex after the addition of 1 µl of preimmune
serum to a 12.5 µl shift reaction. Lanes d and e) Shift complex after the addition
of 1 µl of a 1:10 dilution or 1 µl of undiluted a42 antiserum to a 12.5 µl shift
reaction.

Methods : 

Antibodies a42, a55 and a57 were prepared by injecting approximately 500 µg
of a fusion protein prepared in E. coli using the GE3-3X vector [Smith et al.,
GENE, 67 (1988)]. Rabbits were bled after the second boost and serum prepared.

For Western blots, highly purified ISGF-3 was separated on a 7% SDS
polyacrylamide gel and electroblotted to nitrocellulose. The filter was incubated in
blocking buffer ("blotto"), cut into strips and probed with specific antiserum and
preimmune antiserum diluted 1:500. The immune complexes were visualized with
the aid of an ECL kit (Amersham). Shift analyses were performed as previously
described [Levy et al., GENES & DEV., 2 (1988); Levy et al., GENES & DEV.,
3 (1989)] in a 4.5% polyacrylamide gel.

FIGURE 8 presents the full length amino acid sequence of the 113 kD protein
component of ISGF-3α (SEQ ID NO:2) and alignment of conserved amino acid
sequences between the 113 kD and 91/84 kD proteins (SEQ ID NOS:4 AND 6).

A. Polypeptide sequences (A-E) derived from protein microsequencing of
purified 113 kD protein (see accompanying paper) are underlined. Based on
peptide E, we designed a degenerate oligonucleotide,

AAT/CACIGAA/GCCL\TGGAA/GATT/CATr (SEQ ID NO: 13), which was
used to screen a cDNA library [Pine et al., MOL. CELL. BIOL., 10 (1990)]
basically as described [Norman et al., CELL, 55 (1988)]. Briefly, the degenerate
oligonucleotides were labeled with 32P-γ-ATP by polynucleotide kinase, and
hybridizations were carried out overnight at 40°C in 6 x SSTE (0.9 M NaCl, 60
mM Tris-HCl [pH 7.9], 6 mM EDTA), 0.1% SDS, 2 mM Na4P2O7, 6 mM KH2PO4
in the presence of 100 µg/ml salmon sperm DNA and 10 x Denhardt's
solution [Maniatis et al., MOLECULAR CLONING: A LABORATORY MANUAL
(Cold Spring Harbor Lab., 1982)]. The nitrocellulose filters then were washed 4
x 10 min. under the same hybridization conditions without labeled probe and
salmon sperm DNA. Autoradiography was carried out at -80°C with an intensifying
screen for 48 hrs. A PCR product was obtained later by the same method
described for the 91/84 kD sequences, by using oligonucleotides designed
according to polypeptides D and E. The sequence of this PCR product was identical
to a region in clone f11. The full length 113 kD protein contains 851 amino
acids. Three major helices in the N-terminal region were predicted by the
methods of both Chou and Fasman [Chou et al., ANN. REV. BIOCHEM., 47
(1978)] and Garnier et al. [Garnier et al., J. MOL. BIOL., 12 (1978)] and are
shown in shadowed boxes. At the C-terminal end, a highly negatively charged
domain was found. All negatively charged residues are blackened and positively
charged residues shadowed. The five polypeptides that were derived from protein
microsequencing [Aebersold et al., PROC. NATL. ACAD. SCI. USA, 87 (1987)] are
underlined.

B) Comparison of the amino acid sequences of the 113 kD and 91/84 kD proteins
shows 42% identical amino acid residues in the overlapping 715 amino acid
sequence shown. In the middle helix region, four leucine and one valine heptad
repeats were identified in both the 113 and 91/84 kD proteins (the last leucine in
91/84 kD is not exactly preserved as a heptad repeat). When a heligram structure
was drawn, this helix was amphipathic (not shown). Another notable feature of this
comparison is several tyrosine residues that are conserved in both proteins near
their ends.
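
The percent identity quoted for panel B can be illustrated with a short calculation. The following is a minimal Python sketch, assuming the two sequences have already been aligned to equal length with "-" marking gaps. The function name `percent_identity` and the toy sequences are hypothetical and are not taken from SEQ ID NOS:2, 4 or 6.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of overlapping aligned positions that hold the same residue.

    Assumes both inputs are already aligned (equal length, '-' marks a gap).
    Positions where either sequence has a gap are not counted as overlap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    overlap = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # gap: outside the overlapping region
        overlap += 1
        if res_a == res_b:
            matches += 1
    if overlap == 0:
        raise ValueError("no overlapping positions")
    return 100.0 * matches / overlap


# Toy, made-up aligned sequences (illustration only)
toy_a = "MSQWYELQ-LD"
toy_b = "MAQWNQLQQLD"
print(percent_identity(toy_a, toy_b))
```

On this measure, 42% identity over a 715-residue overlap would correspond to roughly 300 identical positions (0.42 x 715 is about 300).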



FIGURE 9 shows the in vitro transcription and translation of 113 kD and 91 kD
cDNA and a Northern blot analysis with a 113 kD cDNA probe.

a) The full length cDNA clones of the 113 and 91 kD proteins were
transcribed in vitro and the transcribed RNAs were translated in vitro with rabbit
reticulocyte lysate (Promega; conditions as described in the Promega protocol).
The mRNA of BMV (Promega) was simultaneously translated as a protein size
marker. The 113 cDNA yielded a translated product of about 105 kD and the 91
cDNA yielded an 86 kD product.

b) When total cytoplasmic mRNAs isolated from superinduced HeLa cells
were utilized, a single 4.8 KB mRNA band was observed with a cDNA probe
coding for the C-end of the 113 kD protein in a Northern blot analysis [Nielsch et al.,
THE EMBO J., 10 (1991)].

FIGURE 10(A) presents the results of Western blot analysis confirming the
identity of the 113 kD protein. An antiserum raised against a polypeptide segment
[Harlow et al., ANTIBODIES: A LABORATORY MANUAL (Cold Spring Harbor
Lab., 1988)] from amino acid 500 to 650 of the 113 kD protein recognized
specifically a 113 kD protein in a protein Western blot analysis. The antiserum
recognized a band both in a highly purified ISGF-3 fraction (> 10,000 fold) from
DNA affinity chromatography and in the crude extracts prepared from γ and α
IFN treated HeLa cells [Fu et al., PROC. NATL. ACAD. SCI. USA, 87 (1990)].

The antiserum was raised against a fusion protein [a cDNA fragment coding for
part of the 113 kD protein was inserted into pGEX-2T, a high expression vector in
E. coli [Smith et al., PROC. NATL. ACAD. SCI. USA, 83 (1986)]] purified from
E. coli [Smith et al., GENE, 67 (1988)]. Female NZW rabbits were
immunized with 1 mg fusion protein in Freund's adjuvant. Two subsequent boosts
two weeks apart were carried out with 500 µg fusion protein. The Western blot
was carried out with conditions described previously [Pine et al., MOL. CELL.
BIOL., 10 (1990)].

FIGURE 10(B) presents the results of a mobility shift assay showing that the
anti-113 antiserum affects the ISGF-3 shift complex. Preimmune serum or the 113
kD antiserum was added to a shift reaction carried out as described [Fu et al.,
PROC. NATL. ACAD. SCI. USA, 87 (1990); Kessler et al., GENES & DEV., 4
(1990)] at room temperature for 20 min., then one-third of the reaction material was
loaded onto a 5% polyacrylamide gel. In addition, unlabeled probe was included in
one reaction to show specificity of the gel shift complexes.



FIGURE 11 shows the results of experiments investigating the IFN-α dependent
phosphorylation of 113, 91 and 84 kD proteins. Protein samples from cells
treated in various ways after 60 min. exposure to ³²PO₄³⁻ were precipitated with
antiserum to 113 kD protein. Lane 1, no treatment of cells; Lane 2, cells treated
7 min. with IFN-α. By comparison with the marker proteins labeled 200, 97.5,
69 and 46 kD (kilodaltons), the PO₄³⁻ labeled proteins in the precipitate are seen
to be 113 and 91 kD. Lane 3, cells treated with IFN-γ overnight (no
phosphorylated proteins) and then (Lane 4) treated with IFN-α for 7 min. show
heavier phosphorylation of 113, 91 and 84 kD.

FIGURE 12 is a chromatogram depicting the identification of phosphoamino acid.
Phosphate labeled protein of 113, 91 or 84 kD size was hydrolyzed and
chromatographed to reveal newly labeled phosphotyrosine. Cells untreated with
IFN showed only phosphoserine label. (P Ser = phosphoserine; P Thr =
phosphothreonine; P Tyr = phosphotyrosine.)

FIGURE 13 depicts (A) the deduced amino acid sequence (SEQ ID NO:8) of and
(B) the DNA sequence (SEQ ID NO:7) encoding the murine 91 kD intracellular
receptor recognition factor.

FIGURE 14 depicts (A) the deduced amino acid sequence (SEQ ID NO:10) of and
(B) the DNA sequence (SEQ ID NO:9) encoding the 13sf1 intracellular receptor
recognition factor.

FIGURE 15 depicts (A) the deduced amino acid sequence (SEQ ID NO:12) of and
(B) the DNA sequence (SEQ ID NO:11) encoding the 19sf6 intracellular
receptor recognition factor.

FIGURE 16. Determination of molecular weights of Stat91 and phospho Stat91
by native gel analysis.

A) Western blot analysis of fractions from affinity purification. Extracts from
human FS2 fibroblasts treated with IFN-γ (Ext), the unbound fraction (Flow), the
fraction washed with Buffer A0.2 (A0.2), and the bound fraction eluted with
buffer A0.8 (A0.8) were immunoblotted with anti-91T.

B) Native gel analysis. Phosphorylated Stat91 (the A0.8 fraction from A) and
unphosphorylated Stat91 (the Flow fraction from A) were analyzed on 4.5%,
5.5%, 6.5% and 7.5% native polyacrylamide gels followed by immunoblotting
with anti-91T. The top of gels (TOP) and the migration position of bromophenol
blue (BPB) are indicated.

C) Ferguson plots. The relative mobilities (Rm) of the Stat91 and phospho Stat91
were obtained from Figure 1B (see Experimental Procedures). Closed circle:
Chicken egg albumin (45 kD); Cross: Bovine serum albumin, monomer (66 kD);
Open square: Bovine serum albumin, dimer (132 kD); Open circle: Urease, trimer
(272 kD); Open triangle: Unphosphorylated Stat91; Closed triangle:
Phosphorylated Stat91.

D) Determination of molecular weights from the standard curve. The molecular
weights of phosphorylated and unphosphorylated Stat91 proteins (indicated as
closed and open arrows, respectively) were obtained by extrapolation of their
retardation coefficients.
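
The Ferguson analysis named in panels C and D can be sketched numerically. The following Python example is a minimal illustration only: it assumes the usual Ferguson relation log10(Rm) = log10(Y0) - Kr x T (T being the gel concentration in percent, Kr the retardation coefficient), and it assumes a log-log standard curve of Kr against molecular weight, which the legend itself does not specify. The function names are hypothetical, and every mobility and Kr value below is invented; only the marker sizes (45, 66, 132 and 272 kD) come from the legend.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def retardation_coefficient(gel_percents, rel_mobilities):
    """Kr: the negative slope of log10(Rm) versus gel concentration."""
    log_rm = [math.log10(rm) for rm in rel_mobilities]
    slope, _ = fit_line(gel_percents, log_rm)
    return -slope


def molecular_weight_from_kr(kr, standards):
    """Read a size (kD) off an assumed log-log standard curve.

    standards: list of (molecular_weight_kD, Kr) pairs for marker proteins.
    """
    slope, intercept = fit_line(
        [math.log10(kr_std) for _, kr_std in standards],
        [math.log10(mw) for mw, _ in standards],
    )
    return 10 ** (slope * math.log10(kr) + intercept)


# Gel concentrations from panel B; mobilities are made up for illustration.
gels = [4.5, 5.5, 6.5, 7.5]
mobilities = [0.9 * 10 ** (-0.08 * t) for t in gels]
kr = retardation_coefficient(gels, mobilities)

# Marker sizes from the legend paired with invented Kr values.
standards = [(45.0, 0.045), (66.0, 0.066), (132.0, 0.132), (272.0, 0.272)]
print(round(kr, 3), round(molecular_weight_from_kr(kr, standards), 1))
```

A plain least-squares fit is used here for simplicity; any curve-fitting routine would serve equally well for this step.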

FIGURE 17. Determination of molecular weights by glycerol gradients.

A) Western blot analysis. Extracts from human Bud8 fibroblasts treated with IFN-γ
(the rightmost lane) and every other fraction from fraction 16 to 34 were
analyzed on 7.5% SDS-PAGE followed by immunoblotting with anti-91T. The
peak of phosphorylated Stat91 (fraction 20) and the peak of unphosphorylated
Stat91 (fraction 30) are indicated by a closed and open arrow, respectively.

B) Mobility shift analysis. Every other fraction from the gradients was
analyzed.

C) Graphic representation of the data from A and B. Peak fraction numbers of
protein standards are plotted versus their molecular weight. The positions of the
peaks of phosphorylated and unphosphorylated Stat91 protein are indicated by the
closed and open arrows, respectively. Standards are ferritin (Fer, 440 kD), catalase
(Cat, 232 kD), ferritin half unit (Fer 1/2, 220 kD), aldolase (Ald, 158 kD), and bovine
serum albumin (BSA, 68 kD).

FIGURE 18. Stat91 in cell extracts binds DNA as a dimer.

A) Western blot analysis. Extracts from stable cell lines expressing either Stat84
(C84), or Stat91L (C91L) or both (Cmx) were analyzed on 7.5% SDS-PAGE
followed by immunoblotting with anti-91.

B) Gel mobility shift analysis. Extracts from stable cell lines (Fig 3A) untreated
(-) or treated with IFN-γ (+) were analyzed. The positions of Stat91 homodimer
(91L), Stat84 homodimer (84), and the heterodimer (84*91) are indicated.

FIGURE 19. Formation of heterodimer by denaturation and renaturation.
Cytoplasmic (Left Panel) or nuclear extracts (Right Panel) from IFN-γ-treated cell
lines expressing either Stat84 (C84) or Stat91 (C91) were analyzed by gel mobility
shift assays. +: with addition; -: without addition; D/R: samples were subjected
to guanidinium hydrochloride denaturation and renaturation treatment.

FIGURE 20. Diagrammatic representation of dissociation and reassociation
analysis. 

FIGURE 21. Dissociation-reassociation analysis with peptides. Gel mobility shift
analysis with IFN-γ treated nuclear extracts from cell lines expressing Stat91L
(C91L, lane 15) or Stat84 (C84, lane 14) or a mixture of both (lanes 1-13, 16-18) in
the presence of increasing concentrations of various peptides. 91-Y,
unphosphorylated peptide from Stat91 (LDGPKGTGYIKTELI) (SEQ ID
NO:18); 91Y-p, phosphotyrosyl peptide from Stat91 (GY*IKTE) (SEQ ID
NO:19); 113Y-p, phosphotyrosyl peptide with high binding affinity to Src SH2
domain (EPQY*EEIPIYL, Songyang et al., 1993, Cell 72:767-778) (SEQ ID
NO:21). Final concentrations of peptides added: 1 µM (lane 8), 4 µM (lanes 2, 5,
11), 10 µM (lane 9), 40 µM (lanes 3, 6, 10, 12, 14-18), 160 µM (lanes 4, 7, 13).
+: with addition; -: without addition. Right panel: antiserum tests for identity
of gel-shift bands (see Figure 3).

FIGURE 22. Dissociation-reassociation analysis with GST fusion proteins. A)
SDS-PAGE (12%) analysis of purified GST fusion proteins as visualized by
Coomassie blue. GST-91 SH2, native SH2 domain of Stat91; GST-91 mSH2, R*^^
to mutant; GST-91 SH3, SH3 domain of Stat91; GST Src SH2, the SH2
domain of src protein. The same amount (1 µg) of each fusion protein was loaded.
Protein markers were run in lane 1 as indicated.

B) Dissociation-reassociation analysis similar to Figure 6. Dissociating agents
were GST fusion proteins purified from bacterial expression as shown above.
Final concentrations of fusion proteins added are 0.5 µM (lanes 2, 5, 8, 11, 14),
2.5 µM (lanes 3, 6, 9, 12, 15) and 5 µM (lanes 4, 7, 10, 13, 17, 18). +: with
addition; -: without addition; FP: fusion proteins.

FIGURE 23. Comparison of Stat91 SH2 structure with known SH2 structures.
The Stat91 sequence is disclosed herein (SEQ ID NO:4). The structures used for
the other SH2s are Src (Waksman et al., 1992, Nature 358:646-653) (SEQ ID
NO:22), Abl (Overduin et al., 1992, Proc. Natl. Acad. Sci. USA 89:11673-77 and
1992, Cell 70:697-704) (SEQ ID NO:23), Lck (Eck et al., 1993, Nature 362:87-91)
(SEQ ID NO:24), and p85αN (Booker et al., 1992, Nature 358:684-687)
(SEQ ID NO:25). The alignment of the determined structures is by direct
coordinate superimposition of the backbone structures. The names of secondary
structural features and significant residues are based on the scheme of Eck et al.,
1993. The boundaries and extents of the structure features are indicated by [ — ]. 
The starting numbers for the parent sequences are shown in parentheses. 
Experimentally determined structurally conserved regions are from Src, p85α, and
Abl (Cowburn, unpublished). The root mean square deviation of three-dimensionally
aligned structures differs by less than 1 Angstrom for the backbone
non-hydrogen atoms in the sections marked by the XXX.

DETAILED DESCRIPTION 

In accordance with the present invention there may be employed conventional 
molecular biology, microbiology, and recombinant DNA techniques within the 
skill of the art. Such techniques are explained fully in the literature. See, e.g., 
Maniatis, Fritsch & Sambrook, "Molecular Cloning: A Laboratory Manual" 
(1982); "DNA Cloning: A Practical Approach," Volumes I and II (D.N. Glover
ed. 1985); "Oligonucleotide Synthesis" (M.J. Gait ed. 1984); "Nucleic Acid
Hybridization" [B.D. Hames & S.J. Higgins eds. (1985)]; "Transcription And
Translation" [B.D. Hames & S.J. Higgins, eds. (1984)]; "Animal Cell Culture"
[R.I. Freshney, ed. (1986)]; "Immobilized Cells And Enzymes" [IRL Press,
(1986)]; B. Perbal, "A Practical Guide To Molecular Cloning" (1984).

Therefore, if appearing herein, the following terms shall have the definitions set
out below.



The terms "receptor recognition factor", "receptor recognition-tyrosine kinase
factor", "receptor recognition factor/tyrosine kinase substrate", "receptor
recognition/transcription factor", "recognition factor" and "recognition factor
protein(s)" and any variants not specifically listed, may be used herein
interchangeably, and as used throughout the present application and claims refer to
proteinaceous material including single or multiple proteins, and extends to those
proteins having the amino acid sequence data described herein and presented in
FIGURE 1 (SEQ ID NO:2), FIGURE 2 (SEQ ID NO:4) and in FIGURE 3 (SEQ
ID NO:6), and the profile of activities set forth herein and in the Claims.
Accordingly, proteins displaying substantially equivalent or altered activity are
likewise contemplated. These modifications may be deliberate, for example, such
as modifications obtained through site-directed mutagenesis, or may be accidental,
such as those obtained through mutations in hosts that are producers of the
complex or its named subunits. Also, the terms "receptor recognition factor",
"recognition factor" and "recognition factor protein(s)" are intended to include
within their scope proteins specifically recited herein as well as all substantially
homologous analogs and allelic variations.



The amino acid residues described herein are preferred to be in the "L" isomeric
form. However, residues in the "D" isomeric form can be substituted for any
L-amino acid residue, as long as the desired functional property of
immunoglobulin-binding is retained by the polypeptide. NH2 refers to the free amino
group present at the amino terminus of a polypeptide. COOH refers to the free carboxy
group present at the carboxy terminus of a polypeptide. In keeping with standard
polypeptide nomenclature, J. Biol. Chem., 243:3552-59 (1969), abbreviations for
amino acid residues are shown in the following Table of Correspondence:

TABLE OF CORRESPONDENCE

        SYMBOL
1-Letter    3-Letter    AMINO ACID

Y           Tyr         tyrosine
G           Gly         glycine
F           Phe         phenylalanine
M           Met         methionine
A           Ala         alanine
S           Ser         serine
I           Ile         isoleucine
L           Leu         leucine
T           Thr         threonine
V           Val         valine
P           Pro         proline
K           Lys         lysine
H           His         histidine
Q           Gln         glutamine
E           Glu         glutamic acid
W           Trp         tryptophan
R           Arg         arginine
D           Asp         aspartic acid
N           Asn         asparagine
C           Cys         cysteine
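
The correspondence in the Table can also be expressed as a lookup, which may be convenient when converting between the two notations. The following Python sketch is illustrative only: the dictionary name `AMINO_ACIDS` and the helper `to_three_letter` are hypothetical, and the example input is the unphosphorylated form of the GY*IKTE peptide mentioned in the description of FIGURE 21.

```python
# One-letter symbol -> (three-letter symbol, amino acid name),
# listed in the same order as the Table of Correspondence.
AMINO_ACIDS = {
    "Y": ("Tyr", "tyrosine"),
    "G": ("Gly", "glycine"),
    "F": ("Phe", "phenylalanine"),
    "M": ("Met", "methionine"),
    "A": ("Ala", "alanine"),
    "S": ("Ser", "serine"),
    "I": ("Ile", "isoleucine"),
    "L": ("Leu", "leucine"),
    "T": ("Thr", "threonine"),
    "V": ("Val", "valine"),
    "P": ("Pro", "proline"),
    "K": ("Lys", "lysine"),
    "H": ("His", "histidine"),
    "Q": ("Gln", "glutamine"),
    "E": ("Glu", "glutamic acid"),
    "W": ("Trp", "tryptophan"),
    "R": ("Arg", "arginine"),
    "D": ("Asp", "aspartic acid"),
    "N": ("Asn", "asparagine"),
    "C": ("Cys", "cysteine"),
}


def to_three_letter(one_letter_seq: str) -> str:
    """Convert a one-letter sequence (amino- to carboxy-terminus) to three-letter codes."""
    return "-".join(AMINO_ACIDS[residue][0] for residue in one_letter_seq.upper())


print(to_three_letter("GYIKTE"))  # prints Gly-Tyr-Ile-Lys-Thr-Glu
```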



It should be noted that all amino-acid residue sequences are represented herein by
formulae whose left and right orientation is in the conventional direction of
amino-terminus to carboxy-terminus. Furthermore, it should be noted that a dash at the
beginning or end of an amino acid residue sequence indicates a peptide bond to a
further sequence of one or more amino-acid residues. The above Table is
presented to correlate the three-letter and one-letter notations which may appear
alternately herein.

A "replicon" is any genetic element (e.g., plasmid, chromosome, virus) that
functions as an autonomous unit of DNA replication in vivo; i.e., capable of
replication under its own control.

A "vector" is a replicon, such as plasmid, phage or cosmid, to which another
DNA segment may be attached so as to bring about the replication of the attached
segment.

A "DNA molecule" refers to the polymeric form of deoxyribonucleotides (adenine,
guanine, thymine, or cytosine) in either its single-stranded form or a
double-stranded helix. This term refers only to the primary and secondary structure of
the molecule, and does not limit it to any particular tertiary forms. Thus, this
term includes double-stranded DNA found, inter alia, in linear DNA molecules
(e.g., restriction fragments), viruses, plasmids, and chromosomes. In discussing
the structure of particular double-stranded DNA molecules, sequences may be
described herein according to the normal convention of giving only the sequence in
the 5' to 3' direction along the nontranscribed strand of DNA (i.e., the strand
having a sequence homologous to the mRNA).

An "origin of replication" refers to those DNA sequences that participate in DNA
synthesis.

A DNA "coding sequence" is a double-stranded DNA sequence which is
transcribed and translated into a polypeptide in vivo when placed under the control
of appropriate regulatory sequences. The boundaries of the coding sequence are
determined by a start codon at the 5' (amino) terminus and a translation stop
codon at the 3' (carboxyl) terminus. A coding sequence can include, but is not
limited to, prokaryotic sequences, cDNA from eukaryotic mRNA, genomic DNA
sequences from eukaryotic (e.g., mammalian) DNA, and even synthetic DNA
sequences. A polyadenylation signal and transcription termination sequence will
usually be located 3' to the coding sequence.
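
The boundary rule in the definition of a coding sequence (a start codon at the 5' end, an in-frame translation stop codon at the 3' end) can be illustrated with a short scan. The following Python sketch makes several simplifying assumptions: ATG is taken as the only start codon, the standard stop codons TAA, TAG and TGA are used, and only the given strand (written 5' to 3') is searched. The function name `find_coding_sequence` and the example sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_coding_sequence(dna: str):
    """Return (start, end, cds) for the first ATG-initiated open reading frame.

    start is the 0-based index of the A of the start codon; end is one past the
    last base of the stop codon. Returns None when no ATG is followed by an
    in-frame stop codon. Only the supplied 5'->3' strand is scanned.
    """
    dna = dna.upper()
    start = dna.find("ATG")
    while start != -1:
        # Walk codon by codon in the frame set by this ATG.
        for pos in range(start + 3, len(dna) - 2, 3):
            if dna[pos:pos + 3] in STOP_CODONS:
                return start, pos + 3, dna[start:pos + 3]
        start = dna.find("ATG", start + 1)
    return None


example = "GGCATGGCTTATTGATCC"  # made-up sequence for illustration
print(find_coding_sequence(example))  # prints (3, 15, 'ATGGCTTATTGA')
```

The numbering given for FIGURE 1 (nucleotides 1 to 2553, amino acids 1 to 851) is consistent with three nucleotides per residue, since 851 x 3 = 2553.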
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Transcriptional and translational control sequences are DNA regulatory sequences, 
such as promoters, enhancers, polyadenylation signals, terminators, and the like, 
that provide for the expression of a coding sequence in a host cell. 



A "promoter sequence" is a DNA regulatory region capable of binding RNA
polymerase in a cell and initiating transcription of a downstream (3' direction)
coding sequence. For purposes of defining the present invention, the promoter
sequence is bounded at its 3' terminus by the transcription initiation site and
extends upstream (5' direction) to include the minimum number of bases or
elements necessary to initiate transcription at levels detectable above background.
Within the promoter sequence will be found a transcription initiation site
(conveniently defined by mapping with nuclease S1), as well as protein binding
domains (consensus sequences) responsible for the binding of RNA polymerase.
Eukaryotic promoters will often, but not always, contain "TATA" boxes and
"CAT" boxes. Prokaryotic promoters contain Shine-Dalgarno sequences in
addition to the -10 and -35 consensus sequences.

An "expression control sequence" is a DNA sequence that controls and regulates
the transcription and translation of another DNA sequence. A coding sequence is
"under the control" of transcriptional and translational control sequences in a cell
when RNA polymerase transcribes the coding sequence into mRNA, which is then
translated into the protein encoded by the coding sequence.



A "signal sequence" can be included before the coding sequence. This sequence
encodes a signal peptide, N-terminal to the polypeptide, that communicates to the
host cell to direct the polypeptide to the cell surface or secrete the polypeptide into
the media, and this signal peptide is clipped off by the host cell before the protein
leaves the cell. Signal sequences can be found associated with a variety of
proteins native to prokaryotes and eukaryotes.

The term "oligonucleotide", as used herein in referring to the probe of the present
invention, is defined as a molecule comprised of two or more ribonucleotides,
preferably more than three. Its exact size will depend upon many factors which,
in turn, depend upon the ultimate function and use of the oligonucleotide.

The term "primer" as used herein refers to an oligonucleotide, whether occurring
naturally as in a purified restriction digest or produced synthetically, which is
capable of acting as a point of initiation of synthesis when placed under conditions
in which synthesis of a primer extension product, which is complementary to a
nucleic acid strand, is induced, i.e., in the presence of nucleotides and an inducing
agent such as a DNA polymerase and at a suitable temperature and pH. The
primer may be either single-stranded or double-stranded and must be sufficiently
long to prime the synthesis of the desired extension product in the presence of the
inducing agent. The exact length of the primer will depend upon many factors,
including temperature, source of primer and use of the method. For example, for
diagnostic applications, depending on the complexity of the target sequence, the
oligonucleotide primer typically contains 15-25 or more nucleotides, although it
may contain fewer nucleotides.

The primers herein are selected to be "substantially" complementary to different
strands of a particular target DNA sequence. This means that the primers must be
sufficiently complementary to hybridize with their respective strands. Therefore,
the primer sequence need not reflect the exact sequence of the template. For
example, a non-complementary nucleotide fragment may be attached to the 5' end
of the primer, with the remainder of the primer sequence being complementary to
the strand. Alternatively, non-complementary bases or longer sequences can be
interspersed into the primer, provided that the primer sequence has sufficient
complementarity with the sequence of the strand to hybridize therewith and
thereby form the template for the synthesis of the extension product.
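
By way of illustration only, the notion of a primer that is "substantially" rather than exactly complementary can be sketched in a few lines of Python. The function names, the example sequences and the use of a simple base-by-base match fraction are assumptions made for this sketch; they are not part of the disclosure, and actual hybridization also depends on temperature, salt and primer length.

```python
# Illustrative sketch: fraction of primer bases that pair with a template site.
# All names and sequences here are hypothetical examples.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def complementarity(primer: str, template_site: str) -> float:
    """Fraction of primer positions that base-pair with the template site.

    Both sequences are written 5'->3' and must have the same length.
    A value of 1.0 means the primer is exactly complementary.
    """
    if len(primer) != len(template_site) or not primer:
        raise ValueError("primer and template site must be non-empty and equal in length")
    perfect_primer = reverse_complement(template_site)
    matches = sum(p == q for p, q in zip(primer.upper(), perfect_primer))
    return matches / len(primer)


site = "ATGCGTACCTGAAGCTTGCA"            # 20-nucleotide target region
exact = reverse_complement(site)         # exactly complementary primer
tailed = "GGATCC" + exact                # non-complementary fragment on the 5' end
# Only the 3' portion of the tailed primer is compared with the template site.
print(complementarity(exact, site), complementarity(tailed[-len(site):], site))
```

In this sketch the 5' tail is simply excluded from the comparison, which mirrors the statement above that the remainder of the primer is complementary to the strand.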

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As used herein, the terms "restriction endonucleases" and "restriction enzymes" 
refer to bacterial enzymes, each of which cuts double-stranded DNA at or near a
specific nucleotide sequence.

A cell has been "transformed" by exogenous or heterologous DNA when such
DNA has been introduced inside the cell. The transforming DNA may or may not
be integrated (covalently linked) into chromosomal DNA making up the genome of
the cell. In prokaryotes, yeast, and mammalian cells, for example, the
transforming DNA may be maintained on an episomal element such as a plasmid.
With respect to eukaryotic cells, a stably transformed cell is one in which the
transforming DNA has become integrated into a chromosome so that it is inherited
by daughter cells through chromosome replication. This stability is demonstrated
by the ability of the eukaryotic cell to establish cell lines or clones comprised of a
population of daughter cells containing the transforming DNA. A "clone" is a
population of cells derived from a single cell or common ancestor by mitosis. A
"cell line" is a clone of a primary cell that is capable of stable growth in vitro for
many generations.

Two DNA sequences are "substantially homologous" when at least about 75%
(preferably at least about 80%, and most preferably at least about 90 or 95%) of
the nucleotides match over the defined length of the DNA sequences. Sequences
that are substantially homologous can be identified by comparing the sequences
using standard software available in sequence data banks, or in a Southern
hybridization experiment under, for example, stringent conditions as defined for
that particular system. Defining appropriate hybridization conditions is within the
skill of the art. See, e.g., Maniatis et al., supra; DNA Cloning, Vols. I & II,
supra; Nucleic Acid Hybridization, supra.
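
As a minimal sketch of the percentage test stated above, the following Python fragment computes the fraction of matching nucleotides over a defined length and maps it onto the 75%, 80% and 90% thresholds. It assumes, for illustration only, that the two sequences have already been aligned to equal length without gaps; the function names and tier labels are hypothetical, and the sketch is not a substitute for the sequence comparison software or hybridization experiments referred to in the text.

```python
# Illustrative sketch: percent identity of two pre-aligned, equal-length sequences.
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which the two sequences carry the same nucleotide."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def homology_tier(identity: float) -> str:
    """Map a percent identity onto the thresholds recited in the definition."""
    if identity >= 90.0:
        return "most preferred"
    if identity >= 80.0:
        return "preferred"
    if identity >= 75.0:
        return "substantially homologous"
    return "not substantially homologous"


identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")  # one mismatch in ten positions
print(identity, homology_tier(identity))
```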
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A "heterologous" region of the DNA construct is an identifiable segment of DNA 
within a larger DNA molecule that is not found in association with the larger 
molecule in nature. Thus, when the heterologous region encodes a mammalian
gene, the gene will usually be flanked by DNA that does not flank the mammalian
genomic DNA in the genome of the source organism. Another example of a
heterologous coding sequence is a construct where the coding sequence itself is not
found in nature (e.g., a cDNA where the genomic coding sequence contains
introns, or synthetic sequences having codons different than the native gene).
Allelic variations or naturally-occurring mutational events do not give rise to a
heterologous region of DNA as defined herein.

An "antibody" is any immunoglobulin, including antibodies and fragments thereof, 
that binds a specific epitope. The term encompasses polyclonal, monoclonal, and
chimeric antibodies, the last mentioned described in further detail in U.S. Patent 
Nos. 4,816,397 and 4,816,567. 

An "antibody combining site" is that structural portion of an antibody molecule 
comprised of heavy and light chain variable and hypervariable regions that
specifically binds antigen. 



The phrase "antibody molecule" in its various grammatical forms as used herein 
contemplates both an intact immunoglobulin molecule and an immunologically 
active portion of an immunoglobulin molecule.

Exemplary antibody molecules are intact immunoglobulin molecules, substantially
intact immunoglobulin molecules and those portions of an immunoglobulin
molecule that contain the paratope, including those portions known in the art as
Fab, Fab', F(ab')2 and F(v), which portions are preferred for use in the
therapeutic methods described herein.

Fab and F(ab')2 portions of antibody molecules are prepared by the proteolytic
reaction of papain and pepsin, respectively, on substantially intact antibody
molecules by methods that are well-known. See for example, U.S. Patent No.
4,342,566 to Theofilopolous et al. Fab' antibody molecule portions are also
well-known and are produced from F(ab')2 portions followed by reduction of the
disulfide bonds linking the two heavy chain portions as with mercaptoethanol, and
followed by alkylation of the resulting protein mercaptan with a reagent such as
iodoacetamide. An antibody containing intact antibody molecules is preferred
herein.



The phrase "monoclonal antibody" in its various grammatical forms refers to an 
antibody having only one species of antibody combining site capable of 
immunoreacting with a particular antigen. A monoclonal antibody thus typically 
displays a single binding affinity for any antigen with which it immunoreacts. A
monoclonal antibody may therefore contain an antibody molecule having a 
plurality of antibody combining sites, each immunospecific for a different antigen; 
e.g., a bispecific (chimeric) monoclonal antibody. 

The phrase "pharmaceutically acceptable" refers to molecular entities and
compositions that are physiologically tolerable and do not typically produce an
allergic or similar untoward reaction, such as gastric upset, dizziness and the like,
when administered to a human.



The phrase "therapeutically effective amount" is used herein to mean an amount
sufficient to prevent, and preferably reduce by at least about 30 percent, more
preferably by at least 50 percent, most preferably by at least 90 percent, a
clinically significant change in the S phase activity of a target cellular mass, or
other feature of pathology such as, for example, elevated blood pressure, fever or
white cell count as may attend its presence and activity.

A DNA sequence is "operatively linked" to an expression control sequence when
the expression control sequence controls and regulates the transcription and
translation of that DNA sequence. The term "operatively linked" includes having
an appropriate start signal (e.g., ATG) in front of the DNA sequence to be
expressed and maintaining the correct reading frame to permit expression of the
DNA sequence under the control of the expression control sequence and
production of the desired product encoded by the DNA sequence. If a gene that
one desires to insert into a recombinant DNA molecule does not contain an
appropriate start signal, such a start signal can be inserted in front of the gene.
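
The two requirements recited above, a start signal in front of the sequence and maintenance of the reading frame, can be illustrated by the following Python sketch. The function names and the example sequence are hypothetical and are introduced only for illustration; the check assumes the standard stop codons TAA, TAG and TGA and says nothing about the expression control sequence itself.

```python
# Illustrative sketch: start signal and reading-frame checks for a coding sequence.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def ensure_start_codon(coding_seq: str) -> str:
    """Insert an ATG in front of the sequence if it does not already begin with one.

    The downstream frame is preserved only if the supplied sequence is itself
    a whole number of codons.
    """
    seq = coding_seq.upper()
    return seq if seq.startswith("ATG") else "ATG" + seq


def in_frame(seq: str) -> bool:
    """True if seq starts with ATG, is a whole number of codons, and has no
    stop codon before the final codon."""
    seq = seq.upper()
    if not seq.startswith("ATG") or len(seq) % 3 != 0:
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return not any(codon in STOP_CODONS for codon in codons[:-1])


construct = ensure_start_codon("GCTGAATAA")  # gene fragment lacking a start signal
print(construct, in_frame(construct))
```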

The term "standard hybridization conditions" refers to salt and temperature
conditions substantially equivalent to 5 x SSC and 65°C for both hybridization and
wash.

In its primary aspect, the present invention concerns the identification of a
receptor recognition factor, and the isolation and sequencing of a particular
receptor recognition factor protein, that is believed to be present in cytoplasm and
that serves as a signal transducer between a particular cellular receptor having
bound thereto an equally specific polypeptide ligand, and the comparably specific
transcription factor that enters the nucleus of the cell and interacts with a specific
DNA binding site for the activation of the gene to promote the predetermined
response to the particular polypeptide stimulus. The present disclosure confirms
that specific and individual receptor recognition factors exist that correspond to
known stimuli such as tumor necrosis factor, nerve growth factor, platelet-derived
growth factor and the like. Specific evidence of this is set forth herein with
respect to the interferons α and γ (IFNα and IFNγ).

A further property of the receptor recognition factors (also termed herein signal
transducers and activators of transcription, or STAT) is dimerization to form
homodimers or heterodimers upon activation by phosphorylation of tyrosine. In a
specific embodiment, infra, Stat91 and Stat84 form homodimers and a Stat91-
Stat84 heterodimer. Accordingly, the present invention is directed to such dimers,
which can form spontaneously by phosphorylation of the STAT protein, or which
can be prepared synthetically by chemically cross-linking two like or unlike STAT
proteins.

The present receptor recognition factor is likewise noteworthy in that it appears
not to be demonstrably affected by fluctuations in second messenger activity and
concentration. The receptor recognition factor proteins appear to act as a substrate
for tyrosine kinase domains, but do not appear to interact with G-proteins,
and therefore do not appear to be second messengers.

A particular receptor recognition factor, identified herein by SEQ ID NO:4, has
been determined to be present in cytoplasm and serves as a signal transducer and a
specific transcription factor in response to IFN-γ stimulation that enters the
nucleus of the cell and interacts directly with a specific DNA binding site for the
activation of the gene to promote the predetermined response to the particular
polypeptide stimulus. This particular factor also acts as a translation protein and,
in particular, as a DNA binding protein in response to interferon-γ stimulation.
This factor is likewise noteworthy in that it has the following characteristics:

a) It interacts with an interferon-γ-bound receptor kinase complex;

b) It is a tyrosine kinase substrate; and

c) When phosphorylated, it serves as a DNA binding protein.

More particularly, the factor of SEQ ID NO:4 directly interacts with DNA after
acquiring phosphate on tyrosine located at position 701 of the amino acid
sequence. Also, interferon-γ-dependent activation of this factor occurs without
new protein synthesis and appears within minutes of interferon-γ treatment,
achieves maximum extent between 15 and 30 minutes thereafter, and then
disappears after 2-3 hours.

In a particular embodiment, the present invention relates to all members of the
herein disclosed family of receptor recognition factors except the 91 kD protein
factors, specifically the proteins whose sequences are represented by one or more
of SEQ ID NO:4, SEQ ID NO:6 or SEQ ID NO:8.

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Subsequent to the filing of the initial applications directed to the present invention, 
the inventors have termed each member of the family of receptor recognition
factors as a signal transducer and activator of transcription (STAT) protein. Each
STAT protein is designated by the apparent molecular weight (e.g., Stat113,
Stat91, Stat84, etc.), or by the order in which it has been identified (e.g., Stat1α
[Stat91], Stat1β [Stat84], Stat2 [Stat113], Stat3 [a murine protein described in
U.S. Application Serial No. 08/126,588, filed September 24, 1993 as 19sf6], and
Stat4 [a murine STAT protein described in U.S. Application Serial No.
08/126,588, filed September 24, 1993 as 13sf1]). As will be readily appreciated
by one of ordinary skill in the art, the choice of name has no effect on the
intrinsic characteristics of the factors described herein, which were first disclosed
in U.S. Application Serial No. 07/845,296, filed March 19, 1992. The present
inventors have chosen to adopt this newly derived terminology herein as a
convenience to the skilled artisan who is familiar with the subsequently published
papers relating to the same, and in accordance with the proposal to harmonize the
terminology for the novel class of proteins, and nucleic acids encoding the
proteins, disclosed by the instant inventors. The terms [molecular weight] kd
receptor recognition factor, Stat[molecular weight], and Stat[number] are used
herein interchangeably, and have the meanings given above. For example, the
terms 91 kd protein, Stat91, and Stat1α refer to the same protein, and in the
appropriate context refer to the nucleic acid molecule encoding such protein.

As stated above, the present invention also relates to a recombinant DNA molecule
or cloned gene, or a degenerate variant thereof, which encodes a receptor
recognition factor, or a fragment thereof, that possesses a molecular weight of
about 113 kD and an amino acid sequence set forth in FIGURE 1 (SEQ ID NO:2);
preferably a nucleic acid molecule, in particular a recombinant DNA molecule or
cloned gene, encoding the 113 kD receptor recognition factor has a nucleotide
sequence or is complementary to a DNA sequence shown in FIGURE 1 (SEQ ID
NO:1). In another embodiment, the receptor recognition factor has a molecular
weight of about 91 kD and the amino acid sequence set forth in FIGURE 2 (SEQ
ID NO:4) or FIGURE 13 (SEQ ID NO:8); preferably a nucleic acid molecule, in
particular a recombinant DNA molecule or cloned gene, encoding the 91 kD
receptor recognition factor has a nucleotide sequence or is complementary to a
DNA sequence shown in FIGURE 2 (SEQ ID NO:3) or FIGURE 13 (SEQ ID
NO:8). In yet a further embodiment, the receptor recognition factor has a
molecular weight of about 84 kD and the amino acid sequence set forth in
FIGURE 3 (SEQ ID NO:6); preferably a nucleic acid molecule, in particular a
recombinant DNA molecule or cloned gene, encoding the 84 kD receptor
recognition factor has a nucleotide sequence or is complementary to a DNA
sequence shown in FIGURE 3 (SEQ ID NO:5). In yet another embodiment, the
receptor recognition factor has an amino acid sequence set forth in FIGURE 14
(SEQ ID NO:10); preferably a nucleic acid molecule, in particular a recombinant
DNA molecule or cloned gene, encoding such receptor recognition factor has a
nucleotide sequence or is complementary to a DNA sequence shown in FIGURE 14
(SEQ ID NO:9). In still another embodiment, the receptor recognition factor has
an amino acid sequence set forth in FIGURE 15 (SEQ ID NO:12); preferably a
nucleic acid molecule, in particular a recombinant DNA molecule or cloned gene,
encoding such receptor recognition factor has a nucleotide sequence or is
complementary to a DNA sequence shown in FIGURE 15 (SEQ ID NO:11).

The possibilities, both diagnostic and therapeutic, that are raised by the existence of
the receptor recognition factor or factors derive from the fact that the factors
appear to participate in direct and causal protein-protein interaction between the
receptor that is occupied by its ligand, and those factors that thereafter directly
interface with the gene and effect transcription and accordingly gene activation.
As suggested earlier and elaborated further on herein, the present invention
contemplates pharmaceutical intervention in the cascade of reactions in which the
receptor recognition factor is implicated, to modulate the activity initiated by the
stimulus bound to the cellular receptor.

Thus, in instances where it is desired to reduce or inhibit the gene activity
resulting from a particular stimulus or factor, an appropriate inhibitor of the
receptor recognition factor could be introduced to block the interaction of the
receptor recognition factor with those factors causally connected with gene
activation. Correspondingly, instances where insufficient gene activation is taking
place could be remedied by the introduction of additional quantities of the receptor
recognition factor or its chemical or pharmaceutical cognates, analogs, fragments
and the like.



As discussed earlier, the recognition factors or their binding partners or other
ligands or agents exhibiting either mimicry or antagonism to the recognition
factors or control over their production, may be prepared in pharmaceutical
compositions, with a suitable carrier and at a strength effective for administration
by various means to a patient experiencing an adverse medical condition associated
with specific transcriptional stimulation for the treatment thereof. A variety of
administrative techniques may be utilized, among them parenteral techniques such
as subcutaneous, intravenous and intraperitoneal injections, catheterizations and the
like. Average quantities of the recognition factors or their subunits may vary and
in particular should be based upon the recommendations and prescription of a
qualified physician or veterinarian.



Also, antibodies including both polyclonal and monoclonal antibodies, and drugs
that modulate the production or activity of the recognition factors and/or their
subunits may possess certain diagnostic applications and may, for example, be
utilized for the purpose of detecting and/or measuring conditions such as viral
infection or the like. For example, the recognition factor or its subunits may be
used to produce both polyclonal and monoclonal antibodies to themselves in a
variety of cellular media, by known techniques such as the hybridoma technique
utilizing, for example, fused mouse spleen lymphocytes and myeloma cells.
Likewise, small molecules that mimic or antagonize the activity(ies) of the
receptor recognition factors of the invention may be discovered or synthesized,
and may be used in diagnostic and/or therapeutic protocols.



The general methodology for making monoclonal antibodies by hybridomas is well
known. Immortal, antibody-producing cell lines can also be created by techniques
other than fusion, such as direct transformation of B lymphocytes with oncogenic
DNA, or transfection with Epstein-Barr virus. See, e.g., M. Schreier et al.,
"Hybridoma Techniques" (1980); Hammerling et al., "Monoclonal Antibodies And
T-cell Hybridomas" (1981); Kennett et al., "Monoclonal Antibodies" (1980); see
also U.S. Patent Nos. 4,341,761; 4,399,121; 4,427,783; 4,444,887; 4,451,570;
4,466,917; 4,472,500; 4,491,632; 4,493,890.

Panels of monoclonal antibodies produced against recognition factor peptides can
be screened for various properties; i.e., isotype, epitope, affinity, etc. Of
particular interest are monoclonal antibodies that neutralize the activity of the
recognition factor or its subunits. Such monoclonals can be readily identified in
recognition factor activity assays. High affinity antibodies are also useful when
immunoaffinity purification of native or recombinant recognition factor is possible.

Preferably, the anti-recognition factor antibody used in the diagnostic methods of
this invention is an affinity purified polyclonal antibody. More preferably, the
antibody is a monoclonal antibody (mAb). In addition, it is preferable for the
anti-recognition factor antibody molecules used herein to be in the form of Fab,
Fab', F(ab')2 or F(v) portions of whole antibody molecules.


As suggested earlier, the diagnostic method of the present invention comprises
examining a cellular sample or medium by means of an assay including an
effective amount of an antagonist to a receptor recognition factor/protein, such as
an anti-recognition factor antibody, preferably an affinity-purified polyclonal
antibody, and more preferably a mAb. In addition, it is preferable for the
anti-recognition factor antibody molecules used herein to be in the form of Fab, Fab',
F(ab')2 or F(v) portions or whole antibody molecules. As previously discussed,
patients capable of benefiting from this method include those suffering from
cancer, a pre-cancerous lesion, a viral infection or other like pathological
derangement. Methods for isolating the recognition factor and inducing
anti-recognition factor antibodies and for determining and optimizing the ability of
anti-recognition factor antibodies to assist in the examination of the target cells are all
well-known in the art.

Methods for producing polyclonal anti-polypeptide antibodies are well-known in 
the art. See U.S. Patent No. 4,493,795 to Nestor et al. A monoclonal antibody, 
typically containing Fab and/or F(ab')2 portions of useful antibody molecules, can
be prepared using the hybridoma technology described in Antibodies - A 
Laboratory Manual, Harlow and Lane, eds., Cold Spring Harbor Laboratory, New 
York (1988), which is incorporated herein by reference. Briefly, to form the 
hybridoma from which the monoclonal antibody composition is produced, a 
myeloma or other self-perpetuating cell line is fused with lymphocytes obtained 
from the spleen of a mammal hyperimmunized with a recognition factor-binding 
portion thereof, or recognition factor, or an origin-specific DNA-binding portion 
thereof. 

Splenocytes are typically fused with myeloma cells using polyethylene glycol 
(PEG) 6000. Fused hybrids are selected by their ability to grow in HAT medium,
to which unfused myeloma cells are sensitive. Hybridomas
producing a monoclonal antibody useful in practicing this invention are identified 
by their ability to immunoreact with the present recognition factor and their ability 
to inhibit specified transcriptional activity in target cells. 

A monoclonal antibody useful in practicing the present invention can be produced 
by initiating a monoclonal hybridoma culture comprising a nutrient medium 
containing a hybridoma that secretes antibody molecules of the appropriate antigen 
specificity. The culture is maintained under conditions and for a time period 
sufficient for the hybridoma to secrete the antibody molecules into the medium. 
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The antibody-containing medium is then collected. The antibody molecules can 
then be further isolated by well-known techniques. 

Media useful for the preparation of these compositions are both well-known in the 
art and commercially available and include synthetic culture media, inbred mice 
and the like. An exemplary synthetic medium is Dulbecco's minimal essential 
medium (DMEM; Dulbecco et al., Virol. 8:396 (1959)) supplemented with 4.5
gm/l glucose, 20 mM glutamine, and 20% fetal calf serum. An exemplary inbred
mouse strain is the Balb/c. 

Methods for producing monoclonal anti-recognition factor antibodies are also
well-known in the art. See Niman et al., Proc. Natl. Acad. Sci. USA, 80:4949-4953
(1983). Typically, the present recognition factor or a peptide analog is used either
alone or conjugated to an immunogenic carrier, as the immunogen in the before
described procedure for producing anti-recognition factor monoclonal antibodies.
The hybridomas are screened for the ability to produce an antibody that
immunoreacts with the recognition factor peptide analog and the present
recognition factor.

The present invention further contemplates therapeutic compositions useful in 
practicing the therapeutic methods of this invention. A subject therapeutic 
composition includes, in admixture, a pharmaceutically acceptable excipient 
(carrier) and one or more of a receptor recognition factor, polypeptide analog 
thereof or fragment thereof, as described herein as an active ingredient. In a 
preferred embodiment, the composition comprises an antigen capable of 
modulating the specific binding of the present recognition factor within a target 
cell. 

The preparation of therapeutic compositions which contain polypeptides, analogs 
or active fragments as active ingredients is well understood in the art. Typically, 
such compositions are prepared as injectables, either as liquid solutions or
suspensions; however, solid forms suitable for solution in, or suspension in, liquid
prior to injection can also be prepared. The preparation can also be emulsified.
The active therapeutic ingredient is often mixed with excipients which are 
pharmaceutically acceptable and compatible with the active ingredient. Suitable 
excipients are, for example, water, saline, dextrose, glycerol, ethanol, or the like 
and combinations thereof. In addition, if desired, the composition can contain 
minor amounts of auxiliary substances such as wetting or emulsifying agents, pH 
buffering agents which enhance the effectiveness of the active ingredient. 

A polypeptide, analog or active fragment can be formulated into the therapeutic
composition as neutralized pharmaceutically acceptable salt forms.
Pharmaceutically acceptable salts include the acid addition salts (formed with the
free amino groups of the polypeptide or antibody molecule) and which are formed
with inorganic acids such as, for example, hydrochloric or phosphoric acids, or
such organic acids as acetic, oxalic, tartaric, mandelic, and the like. Salts formed
from the free carboxyl groups can also be derived from inorganic bases such as,
for example, sodium, potassium, ammonium, calcium, or ferric hydroxides, and
such organic bases as isopropylamine, trimethylamine, 2-ethylamino ethanol,
histidine, procaine, and the like.

The therapeutic polypeptide-, analog- or active fragment-containing compositions 
are conventionally administered intravenously, as by injection of a unit dose, for 
example. The term "unit dose" when used in reference to a therapeutic 
composition of the present invention refers to physically discrete units suitable as 
unitary dosage for humans, each unit containing a predetermined quantity of active 
material calculated to produce the desired therapeutic effect in association with the 
required diluent; i.e., carrier, or vehicle. 

The compositions are administered in a manner compatible with the dosage 
formulation, and in a therapeutically effective amount. The quantity to be 
administered depends on the subject to be treated, capacity of the subject's 
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immune system to utilize the active ingredient, and degree of inhibition or 
neutralization of recognition factor binding capacity desired. Precise amounts of 
active ingredient required to be administered depend on the judgment of the 
practitioner and are peculiar to each individual. However, suitable dosages may
range from about 0.1 to 20, preferably about 0.5 to about 10, and more preferably
one to several, milligrams of active ingredient per kilogram body weight of
individual per day and depend on the route of administration. Suitable regimes for
initial administration and booster shots are also variable, but are typified by an
initial administration followed by repeated doses at one or more hour intervals by
a subsequent injection or other administration. Alternatively, continuous
intravenous infusion sufficient to maintain concentrations of ten nanomolar to ten
micromolar in the blood is contemplated.
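
Purely as an arithmetic illustration of the ranges recited above, the following Python sketch converts a per-kilogram dosage into a daily quantity and a molar blood concentration into a mass concentration. The 70 kg body weight is an assumed example value, and the use of 91,000 g/mol is an assumption based on the apparent molecular weight of the 91 kD factor; neither figure is prescribed by the text, and the function names are hypothetical.

```python
# Illustrative arithmetic only: unit conversions for the recited dosage ranges.
def daily_dose_mg(body_weight_kg: float, dose_mg_per_kg: float) -> float:
    """Daily quantity of active ingredient in milligrams."""
    return body_weight_kg * dose_mg_per_kg


def molar_to_mg_per_litre(conc_molar: float, mol_weight_g_per_mol: float) -> float:
    """Convert a molar concentration to milligrams per litre."""
    return conc_molar * mol_weight_g_per_mol * 1000.0


BODY_WEIGHT_KG = 70.0       # assumed example subject
MOL_WEIGHT = 91_000.0       # assumed: apparent molecular weight of the 91 kD factor

low = daily_dose_mg(BODY_WEIGHT_KG, 0.1)     # lower end of the 0.1 to 20 mg/kg range
high = daily_dose_mg(BODY_WEIGHT_KG, 20.0)   # upper end of the range
infusion_low = molar_to_mg_per_litre(10e-9, MOL_WEIGHT)   # ten nanomolar
infusion_high = molar_to_mg_per_litre(10e-6, MOL_WEIGHT)  # ten micromolar
print(low, high, infusion_low, infusion_high)
```

Under these assumptions the broadest recited range corresponds to roughly 7 to 1400 mg per day, and the infusion range to roughly 0.91 to 910 mg per litre of blood.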

The therapeutic compositions may further include an effective amount of the 
factor/factor synthesis promoter antagonist or analog thereof, and one or more of 
the following active ingredients: an antibiotic, a steroid. Exemplary formulations 
are given below: 




Formulations

Intravenous Formulation I
Ingredient                            mg/ml
cefotaxime                            250.0
receptor recognition factor            10.0
dextrose USP                           45.0
sodium bisulfite USP                    3.2
edetate disodium USP                    0.1
water for injection q.s.a.d.            1.0 ml

Intravenous Formulation II
Ingredient                            mg/ml
ampicillin                            250.0
receptor recognition factor            10.0
sodium bisulfite USP                    3.2
disodium edetate USP                    0.1
water for injection q.s.a.d.            1.0 ml

Intravenous Formulation III
Ingredient                            mg/ml
gentamicin (charged as sulfate)        40.0
receptor recognition factor            10.0
sodium bisulfite USP                    3.2
disodium edetate USP                    0.1
water for injection q.s.a.d.            1.0 ml

Intravenous Formulation IV
Ingredient                            mg/ml
recognition factor                     10.0
dextrose USP                           45.0
sodium bisulfite USP                    3.2
edetate disodium USP                    0.1
water for injection q.s.a.d.            1.0 ml

Intravenous Formulation V
Ingredient                            mg/ml
recognition factor antagonist           5.0
sodium bisulfite USP                    3.2
disodium edetate USP                    0.1
water for injection q.s.a.d.            1.0 ml



As used herein, "pg" means picogram, "ng" means nanogram, "ug" or "µg" mean
microgram, "mg" means milligram, "ul" or "µl" mean microliter, "ml" means
milliliter, "l" means liter.

Another feature of this invention is the expression of the DNA sequences disclosed 
herein. As is well known in the art, DNA sequences may be expressed by 
operatively linking them to an expression control sequence in an appropriate 
expression vector and employing that expression vector to transform an 
appropriate unicellular host. 

Such operative linking of a DNA sequence of this invention to an expression 
control sequence, of course, includes, if not already part of the DNA sequence,
the provision of an initiation codon, ATG, in the correct reading frame upstream 
of the DNA sequence. 



A wide variety of host/expression vector combinations may be employed in
expressing the DNA sequences of this invention. Useful expression vectors, for
example, may consist of segments of chromosomal, non-chromosomal and
synthetic DNA sequences. Suitable vectors include derivatives of SV40 and
known bacterial plasmids, e.g., E. coli plasmids colE1, pCR1, pBR322, pMB9
and their derivatives, plasmids such as RP4; phage DNAs, e.g., the numerous
derivatives of phage λ, e.g., NM989, and other phage DNA, e.g., M13 and
filamentous single stranded phage DNA; yeast plasmids such as the 2µ plasmid or
derivatives thereof; vectors useful in eukaryotic cells, such as vectors useful in
insect or mammalian cells; vectors derived from combinations of plasmids and
phage DNAs, such as plasmids that have been modified to employ phage DNA or
other expression control sequences; and the like.

Any of a wide variety of expression control sequences, i.e., sequences that control the
expression of a DNA sequence operatively linked to them, may be used in these
vectors to express the DNA sequences of this invention. Such useful expression
control sequences include, for example, the early or late promoters of SV40,
CMV, vaccinia, polyoma or adenovirus, the lac system, the trp system, the TAC
system, the TRC system, the LTR system, the major operator and promoter regions
of phage λ, the control regions of fd coat protein, the promoter for
3-phosphoglycerate kinase or other glycolytic enzymes, the promoters of acid
phosphatase (e.g., Pho5), the promoters of the yeast α-mating factors, and other
sequences known to control the expression of genes of prokaryotic or eukaryotic
cells or their viruses, and various combinations thereof.

A wide variety of unicellular host cells are also useful in expressing the DNA
sequences of this invention. These hosts may include well known eukaryotic and
prokaryotic hosts, such as strains of E. coli, Pseudomonas, Bacillus,
Streptomyces, fungi such as yeasts, and animal cells, such as CHO, R1.1, B-W and
L-M cells, African Green Monkey kidney cells (e.g., COS 1, COS 7, BSC1,
BSC40, and BMT10), insect cells (e.g., Sf9), and human cells and plant cells in
tissue culture.

It will be understood that not all vectors, expression control sequences and hosts 
will function equally well to express the DNA sequences of this invention. 
Neither will all hosts function equally well with the same expression system. 
However, one skilled in the art will be able to select the proper vectors, 
expression control sequences, and hosts without undue experimentation to 
accomplish the desired expression without departing from the scope of this 
invention. For example, in selecting a vector, the host must be considered 
because the vector must function in it. The vector's copy number, the ability to 
control that copy number, and the expression of any other proteins encoded by the 
vector, such as antibiotic markers, will also be considered. 



In selecting an expression control sequence, a variety of factors will normally be 
considered. These include, for example, the relative strength of the system, its 
controllability, and its compatibility with the particular DNA sequence or gene to 
be expressed, particularly as regards potential secondary structures. Suitable 
unicellular hosts will be selected by consideration of, e.g., their compatibility with 
the chosen vector, their secretion characteristics, their ability to fold proteins 
correctly, and their fermentation requirements, as well as the toxicity to the host 
of the product encoded by the DNA sequences to be expressed, and the ease of 
purification of the expression products. 

Considering these and other factors, a person skilled in the art will be able to 
construct a variety of vector/expression control sequence/host combinations that 
will express the DNA sequences of this invention on fermentation or in large scale 
animal culture. 



It is further intended that receptor recognition factor analogs may be prepared 
from nucleotide sequences of the protein complex/subunit derived within the scope 
of the present invention. Analogs, such as fragments, may be produced, for 
example, by pepsin digestion of receptor recognition factor material. Other 
analogs, such as muteins, can be produced by standard site-directed mutagenesis of 
receptor recognition factor coding sequences. Analogs exhibiting "receptor 
recognition factor activity" such as small molecules, whether functioning as 
promoters or inhibitors, may be identified by known in vivo and/or in vitro assays. 

As mentioned above, a DNA sequence encoding receptor recognition factor can be 
prepared synthetically rather than cloned. The DNA sequence can be designed 
with the appropriate codons for the receptor recognition factor amino acid 
sequence. In general, one will select preferred codons for the intended host if the 
sequence will be used for expression. The complete coding sequence is assembled 
from overlapping oligonucleotides prepared by standard methods. See, e.g., Edge, 
Nature, 292:756 (1981); Nambair et al., Science, 223:1299 (1984); Jay et al., 
J. Biol. Chem., 259:6311 (1984). 

Synthetic DNA sequences allow convenient construction of genes which will 
express receptor recognition factor analogs or "muteins". Alternatively, DNA 
encoding muteins can be made by site-directed mutagenesis of native receptor 
recognition factor genes or cDNAs, and muteins can be made directly using 
conventional polypeptide synthesis. 

A general method for site-specific incorporation of unnatural amino acids into 
proteins is described in Christopher J. Noren, Spencer J. Anthony-Cahill, Michael 
C. Griffith, Peter G. Schultz, Science, 244:182-188 (April 1989). This method 
may be used to create analogs with unnatural amino acids. 

The present invention extends to the preparation of antisense nucleotides and 
ribozymes that may be used to interfere with the expression of the receptor 
recognition proteins at the translational level. This approach utilizes antisense 
nucleic acid and ribozymes to block translation of a specific mRNA, either by 
masking that mRNA with an antisense nucleic acid or cleaving it with a ribozyme. 

Antisense nucleic acids are DNA or RNA molecules that are complementary to at 
least a portion of a specific mRNA molecule. (See Weintraub, 1990; 
Marcus-Sekura, 1988.) In the cell, they hybridize to that mRNA, forming a 
double-stranded molecule. The cell does not translate an mRNA in this 
double-stranded form. Therefore, antisense nucleic acids interfere with the 
expression of mRNA into protein. Oligomers of about fifteen nucleotides and 
molecules that hybridize to the AUG initiation codon will be particularly 
efficient, since they are easy to synthesize and are likely to pose fewer problems 
than larger molecules when introducing them into receptor recognition 
factor-producing cells. Antisense methods have been used to inhibit the expression 
of many genes in vitro (Marcus-Sekura, 1988; Hambor et al., 1988). 
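
By way of illustration only, the design of an antisense oligomer of about fifteen 
nucleotides spanning an AUG codon may be sketched as follows. The function name, 
the "upstream" parameter and the example mRNA sequence are hypothetical and are 
not taken from the present disclosure; the sketch also assumes that the first AUG 
found in the sequence is the initiation codon, which would need to be confirmed 
for any actual message. 

```python
# Watson-Crick pairing between an RNA target and a DNA antisense oligomer.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def antisense_oligo(mrna, length=15, upstream=6):
    """Return a DNA oligomer (5'->3') complementary to a window of the mRNA
    that spans the first AUG.

    `upstream` is a hypothetical parameter: the number of bases 5' of the
    AUG to include in the target window.
    """
    mrna = mrna.upper()
    start = mrna.find("AUG")
    if start < 0:
        raise ValueError("no AUG codon found")
    begin = max(0, start - upstream)
    window = mrna[begin:begin + length]
    if len(window) < length:
        raise ValueError("mRNA too short for the requested oligomer length")
    # Complement each base and reverse, so the oligomer reads 5'->3'.
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(window))


# Made-up sequence, for illustration only.
example_mrna = "GGCACCAUGGCUCAGUGGAAC"
print(antisense_oligo(example_mrna))
```

The oligomer returned contains the triplet CAT, the complement of the AUG codon, 
within a fifteen-nucleotide sequence. 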

Ribozymes are RNA molecules possessing the ability to specifically cleave other 
single stranded RNA molecules in a manner somewhat analogous to DNA 
restriction endonucleases. Ribozymes were discovered from the observation that 
certain mRNAs have the ability to excise their own introns. By modifying the 
nucleotide sequence of these RNAs, researchers have been able to engineer 
molecules that recognize specific nucleotide sequences in an RNA molecule and 
cleave it (Cech, 1988). Because they are sequence-specific, only mRNAs with 
particular sequences are inactivated. 

Investigators have identified two types of ribozymes, Tetrahymena-type and 
"hammerhead"-type. (Hasselhoff and Gerlach, 1988) Tetrahymena-type ribozymes 
recognize four-base sequences, while "hammerhead"-type recognize eleven- to 
eighteen-base sequences. The longer the recognition sequence, the more likely it 
is to occur exclusively in the target mRNA species. Therefore, hammerhead-type 
ribozymes are preferable to Tetrahymena-type ribozymes for inactivating a specific 
mRNA species, and eighteen base recognition sequences are preferable to shorter 
recognition sequences. 
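
The preference for longer recognition sequences can be illustrated with a rough 
calculation. The following sketch assumes random sequence with the four bases 
equally likely, which real mRNA populations only approximate, and the pool size 
is a hypothetical figure chosen for illustration. 

```python
def expected_chance_sites(site_length, pool_bases):
    """Expected number of exact matches to one recognition sequence of
    `site_length` bases within `pool_bases` bases of random sequence,
    assuming the four bases are independent and equally likely."""
    positions = max(pool_bases - site_length + 1, 0)
    return positions / 4 ** site_length


# Hypothetical pool size: total bases of distinct mRNA sequence in a cell type.
POOL = 2 * 10**7

for n in (4, 11, 18):
    print(n, expected_chance_sites(n, POOL))
```

Under these assumptions a four-base sequence is expected to occur by chance many 
thousands of times in such a pool, whereas an eighteen-base sequence is expected 
far less than once. 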

The DNA sequences described herein may thus be used to prepare antisense 
molecules against, and ribozymes that cleave mRNAs for receptor recognition 
factor proteins and their ligands. 

The present invention also relates to a variety of diagnostic applications, including 
methods for detecting the presence of stimuli such as the earlier referenced 
polypeptide ligands, by reference to their ability to elicit the activities which are 
mediated by the present receptor recognition factor. As mentioned earlier, the 
receptor recognition factor can be used to produce antibodies to itself by a variety 
of known techniques, and such antibodies could then be isolated and utilized in 
tests for the presence of particular transcriptional activity in suspect target cells. 

As described in detail above, antibody(ies) to the receptor recognition factor can 
be produced and isolated by standard methods including the well known hybridoma 
techniques. For convenience, the antibody(ies) to the receptor recognition factor 
will be referred to herein as Ab1, and antibody(ies) raised in another species as 
Ab2. 

The presence of receptor recognition factor in cells can be ascertained by the usual 
immunological procedures applicable to such determinations. A number of useful 
procedures are known. Three such procedures which are especially useful utilize 
either the receptor recognition factor labeled with a detectable label, antibody Ab1 
labeled with a detectable label, or antibody Ab2 labeled with a detectable label. 
The procedures may be summarized by the following equations wherein the 
asterisk indicates that the particle is labeled, and "RRF" stands for the receptor 
recognition factor: 

A. RRF* + Ab1 = RRF*Ab1 

B. RRF + Ab1* = RRFAb1* 

C. RRF + Ab1 + Ab2* = RRFAb1Ab2* 

The procedures and their application are all familiar to those skilled in the art and 
accordingly may be utilized within the scope of the present invention. The 
"competitive" procedure, Procedure A, is described in U.S. Patent Nos. 3,654,090 
and 3,850,752. Procedure C, the "sandwich" procedure, is described in U.S. 
Patent Nos. RE 31,006 and 4,016,043. Still other procedures are known such as 
the "double antibody", or "DASP" procedure. 




In each instance, the receptor recognition factor forms complexes with one or 
more antibody(ies) or binding partners and one member of the complex is labeled 
with a detectable label. The fact that a complex has formed and, if desired, the 
amount thereof, can be determined by known methods applicable to the detection 
of labels. 

It will be seen from the above that a characteristic property of Ab2 is that it will 
react with Ab1. This is because Ab1 raised in one mammalian species has been 
used in another species as an antigen to raise the antibody Ab2. For example, Ab2 
may be raised in goats using rabbit antibodies as antigens. Ab2 therefore would be 
anti-rabbit antibody raised in goats. For purposes of this description and claims, 
Ab1 will be referred to as a primary or anti-receptor recognition factor antibody, 
and Ab2 will be referred to as a secondary or anti-Ab1 antibody. 

The labels most commonly employed for these studies are radioactive elements, 
enzymes, chemicals which fluoresce when exposed to ultraviolet light, and others. 

A number of fluorescent materials are known and can be utilized as labels. These 
include, for example, fluorescein, rhodamine and auramine. A particular detecting 
material is anti-rabbit antibody prepared in goats and conjugated with fluorescein 
through an isothiocyanate. 

The receptor recognition factor or its binding partner(s) can also be labeled with a 
radioactive element or with an enzyme. The radioactive label can be detected by 
any of the currently available counting procedures. The preferred isotope may be 
selected from ³H, ¹⁴C, ³²P, ³⁵S, ³⁶Cl, ⁵¹Cr, ⁵⁷Co, ⁵⁸Co, ⁵⁹Fe, ⁹⁰Y, ¹²⁵I, ¹³¹I, and 
¹⁸⁶Re. 



Enzyme labels are likewise useful, and can be detected by any of the presently 
utilized colorimetric, spectrophotometric, fluorospectrophotometric, amperometric 
or gasometric techniques. The enzyme is conjugated to the selected particle by 
reaction with bridging molecules such as carbodiimides, diisocyanates, 
glutaraldehyde and the like. Many enzymes which can be used in these 
procedures are known and can be utilized. The preferred are peroxidase, 
β-glucuronidase, β-D-glucosidase, β-D-galactosidase, urease, glucose oxidase plus 
peroxidase and alkaline phosphatase. U.S. Patent Nos. 3,654,090; 3,850,752; and 
4,016,043 are referred to by way of example for their disclosure of alternate 
labeling material and methods. 

A particular assay system developed and utilized in accordance with the present 
invention is known as a receptor assay. In a receptor assay, the material to be 
assayed is appropriately labeled and then certain cellular test colonies are 
inoculated with a quantity of both the labeled and unlabeled material, after which 
binding studies are conducted to determine the extent to which the labeled material 
binds to the cell receptors. In this way, differences in affinity between materials 
can be ascertained. 

Accordingly, a purified quantity of the receptor recognition factor may be 
radiolabeled and combined, for example, with antibodies or other inhibitors 
thereto, after which binding studies would be carried out. Solutions would then be 
prepared that contain various quantities of labeled and unlabeled uncombined 
receptor recognition factor, and cell samples would then be inoculated and 
thereafter incubated. The resulting cell monolayers are then washed, solubilized 
and then counted in a gamma counter for a length of time sufficient to yield a 
standard error of <5%. These data are then subjected to Scatchard analysis after 
which observations and conclusions regarding material activity can be drawn. 
While the foregoing is exemplary, it illustrates the manner in which a receptor 
assay may be performed and utilized, in the instance where the cellular binding 
ability of the assayed material may serve as a distinguishing characteristic. 
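
The Scatchard analysis referred to above may be sketched as follows, by way of 
illustration only. The sketch assumes a single class of independent binding 
sites, for which B/F = (Bmax - B)/Kd, so that a plot of bound/free against bound 
is a straight line of slope -1/Kd with an x-intercept of Bmax. The function name 
and the data are hypothetical; the data are synthetic and noise-free and are not 
results of the present invention. 

```python
def scatchard_fit(bound, free):
    """Fit a least-squares line through the points (B, B/F).

    Returns (Kd, Bmax), assuming one class of independent sites:
        B/F = (Bmax - B) / Kd
    so that slope = -1/Kd and the x-intercept equals Bmax.
    """
    xs = list(bound)
    ys = [b / f for b, f in zip(bound, free)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope
    bmax = -intercept / slope
    return kd, bmax


# Synthetic data generated from Kd = 2.0 and Bmax = 10.0 (arbitrary units).
free = [0.5, 1.0, 2.0, 4.0, 8.0]
bound = [10.0 * f / (2.0 + f) for f in free]

kd, bmax = scatchard_fit(bound, free)
print(kd, bmax)
```

With measured data the points scatter about the line, and curvature in the plot 
would suggest that the single-site assumption does not hold. 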

An assay useful and contemplated in accordance with the present invention is 
known as a "cis/trans" assay. Briefly, this assay employs two genetic constructs, 
one of which is typically a plasmid that continually expresses a particular receptor 
of interest when transfected into an appropriate cell line, and the second of which 
is a plasmid that expresses a reporter such as luciferase, under the control of a 
receptor/ligand complex. Thus, for example, if it is desired to evaluate a 
compound as a ligand for a particular receptor, one of the plasmids would be a 
construct that results in expression of the receptor in the chosen cell line, while the 
second plasmid would possess a promoter linked to the luciferase gene in which 
the response element to the particular receptor is inserted. If the compound under 
test is an agonist for the receptor, the ligand will complex with the receptor, and 
the resulting complex will bind the response element and initiate transcription of 
the luciferase gene. The resulting chemiluminescence is then measured 
photometrically, and dose response curves are obtained and compared to those of 
known ligands. The foregoing protocol is described in detail in U.S. Patent No. 
4,981,784 and PCT International Publication No. WO 88/03168, to which 
the artisan is referred. 
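
As an illustration of how a dose response curve from such a reporter assay might 
be compared with that of a known ligand, the following sketch estimates the dose 
giving a half-maximal response by log-linear interpolation. This is one simple 
approach and is not the method of the cited references; the function name, doses 
and light readings are hypothetical. 

```python
import math


def ec50_by_interpolation(doses, responses):
    """Estimate the half-maximal dose by interpolating on log10(dose).

    Assumes doses are positive and ascending and that the response rises
    with dose between its minimum and maximum observed values.
    """
    baseline, top = min(responses), max(responses)
    half = baseline + (top - baseline) / 2.0
    points = list(zip(doses, responses))
    for (d0, r0), (d1, r1) in zip(points, points[1:]):
        if r0 <= half <= r1 and r1 > r0:
            frac = (half - r0) / (r1 - r0)
            log_ec50 = math.log10(d0) + frac * (math.log10(d1) - math.log10(d0))
            return 10 ** log_ec50
    raise ValueError("half-maximal response is not bracketed by the data")


# Made-up values: ligand concentration (molar) and luminometer readings.
doses = [1e-10, 1e-9, 1e-8, 1e-7, 1e-6]
light_units = [100.0, 150.0, 600.0, 1050.0, 1100.0]

print(ec50_by_interpolation(doses, light_units))
```

A compound whose estimated half-maximal dose is lower than that of a known ligand 
in the same cell line would, on this measure, be the more potent agonist. 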

In a further embodiment of this invention, commercial test kits suitable for use by 
a medical specialist may be prepared to determine the presence or absence of 
predetermined transcriptional activity or predetermined transcriptional activity 
capability in suspected target cells. In accordance with the testing techniques 
discussed above, one class of such kits will contain at least the labeled receptor 
recognition factor or its binding partner, for instance an antibody specific thereto, 
and directions, of course, depending upon the method selected, e.g., 
"competitive", "sandwich", "DASP" and the like. The kits may also contain 
peripheral reagents such as buffers, stabilizers, etc. 

Accordingly, a test kit may be prepared for the demonstration of the presence or 
capability of cells for predetermined transcriptional activity, comprising: 

(a) a predetermined amount of at least one labeled immunochemically reactive 
component obtained by the direct or indirect attachment of the present receptor 
recognition factor or a specific binding partner thereto, to a detectable label; 

(b) other reagents; and 

(c) directions for use of said kit. 



More specifically, the diagnostic test kit may comprise: 

(a) a known amount of the receptor recognition factor as described above (or 
a binding partner) generally bound to a solid phase to form an immunosorbent, or 
in the alternative, bound to a suitable tag, or plural such end products, etc. (or 
their binding partners) one of each; 

(b) if necessary, other reagents; and 

(c) directions for use of said test kit. 

In a further variation, the test kit may be prepared and used for the purposes stated 
above, which operates according to a predetermined protocol (e.g. "competitive", 
"sandwich", "double antibody", etc.), and comprises: 

(a) a labeled component which has been obtained by coupling the receptor 
recognition factor to a detectable label; 

(b) one or more additional immunochemical reagents of which at least one 
reagent is a ligand or an immobilized ligand, which ligand is selected from the 
group consisting of: 

(i) a ligand capable of binding with the labeled component (a); 

(ii) a ligand capable of binding with a binding partner of the labeled 
component (a); 

(iii) a ligand capable of binding with at least one of the component(s) to 
be determined; and 

(iv) a ligand capable of binding with at least one of the binding partners of 
at least one of the component(s) to be determined; and 

(c) directions for the performance of a protocol for the detection and/or 
determination of one or more components of an immunochemical reaction between 
the receptor recognition factor and a specific binding partner thereto. 




In accordance with the above, an assay system for screening potential drugs 
effective to modulate the activity of the receptor recognition factor may be 
prepared. The receptor recognition factor may be introduced into a test system, 
and the prospective drug may also be introduced into the resulting cell culture, and 
the culture thereafter examined to observe any changes in the transcriptional 
activity of the cells, due either to the addition of the prospective drug alone, or 
due to the effect of added quantities of the known receptor recognition factor. 

PRELIMINARY CONSIDERATIONS 

As mentioned earlier, the observation and conclusion underlying the present 
invention were crystallized from a consideration of the results of certain 
investigations with particular stimuli. Particularly, the present disclosure is 
illustrated by the results of work on protein factors that govern transcriptional 
control of IFNα-stimulated genes, as well as more recent data on the regulation of 
transcription of genes stimulated by IFNγ. The following is a brief discussion of 
the role that IFN is believed to play in the stimulation of transcription, taken from 
Darnell et al., THE NEW BIOLOGIST, 2(10), (1990). 

Activation of genes by IFNα occurs within minutes of exposure of cells to this 
factor (Larner et al., 1984, 1986) and is strictly dependent on the IFNα binding to 
its receptor, a 49-kD plasma membrane polypeptide (Uze et al., 1990). However, 
changes in intracellular second messenger concentrations secondary to the use of 
phorbol esters, calcium ionophores, or cyclic nucleotide analogs neither trigger 
nor block IFNα-dependent gene activation (Larner et al., 1984; Lew et al., 
1989). No other polypeptide, even IFNγ, induces the set of interferon-stimulated 
genes (ISGs) specifically induced by IFNα. In addition, it has been found that 
IFNγ-dependent transcriptional stimulation of at least one gene in HeLa cells and 
in fibroblasts is also strictly dependent on receptor-ligand interaction and is not 
activated by induced changes in second messengers (Decker et al., 1989; Lew et 
al., 1989). These highly specific receptor-ligand interactions, as well as the 
precise transcriptional response, require the intracellular recognition of receptor 
occupation and the communication to the nucleus to be equally specific. 

The activation of ISGs by IFNα is carried out by transcriptional factor ISGF-3, or 
interferon stimulated gene factor 3. This factor is activated promptly after IFNα 
treatment without protein synthesis, as is transcription itself (Larner et al., 1986; 
Levy et al., 1988; Levy et al., 1989). ISGF-3 binds to the ISRE, the 
interferon-stimulated response element, in DNA of the response genes (Reich et al., 
1987; Levy et al., 1988), and this binding is affected by all of an extensive set of 
mutations that also affects the transcriptional function of the ISRE (Kessler et al., 
1988a). Partially purified ISGF-3 containing no other DNA-binding components 
can stimulate ISRE-dependent in vitro transcription (Fu et al., 1990). 
IFN-dependent stimulation of ISGs occurs in a cycle, reaching a peak at 2 hours and 
declining promptly thereafter (Larner et al., 1986). ISGF-3 follows the same 
cycle (Levy et al., 1988, 1989). Finally, the presence or absence of ISGF-3 in a 
variety of IFN-sensitive and IFN-resistant cells correlates with the transcription of 
ISGs in these cells (Kessler et al., 1988b). 

ISGF-3 is composed of two subfractions, ISGF-3α and ISGF-3γ, that are found in 
the cytoplasm before IFN binds to its receptor (Levy et al., 1989). When cells are 
treated with IFNα, ISGF-3 can be detected in the cytoplasm within a minute, that 
is, some 3 to 4 minutes before any ISGF-3 is found in the nucleus (Levy et al., 
1989). The cytoplasmic component ISGF-3γ can be increased in HeLa cells by 
pretreatment with IFNγ, but IFNγ does not by itself activate transcription of ISGs 
nor raise the concentration of the complete factor, ISGF-3 (Levy et al., 1990). 
The cytoplasmic localization of the proteins that interact to constitute ISGF-3 was 
proved by two kinds of experiments. When cytoplasm of IFNγ-treated cells that 
lack ISGF-3 was mixed with cytoplasm of IFNα-treated cells, large amounts of 
ISGF-3 were formed (Levy et al., 1989). (It was this experiment that indicated 
the existence of an ISGF-3γ component and an ISGF-3α component of ISGF-3.) 
In addition, Dale et al. (1989) showed that enucleated cells could respond to IFNα 
by forming a DNA-binding protein that is probably the same as ISGF-3. 

The ISGF-3γ component is a 48-kD protein that specifically recognizes the ISRE 
(Kessler et al., 1990; Fu et al., 1990). Three other proteins, presumably 
constituting the ISGF-3α component, were found in an ISGF-3 DNA complex (Fu 
et al., 1990). The full roles of these three proteins, and the relationships among 
them, are not yet known, but it is clear that ISGF-3 is a multimeric protein 
complex. Since the binding of IFNα to the cell surface converts ISGF-3α from an 
inactive to an active status within a minute, at least one of the proteins constituting 
ISGF-3α must be affected promptly, perhaps by a direct interaction with the IFNα 
receptor. 

The details of how the ISGF-3γ component and the three other proteins are 
activated by cytoplasmic events and then enter the nucleus to bind the ISRE and 
increase transcription are not entirely known. Further studies of the individual 
proteins, for example, with antibodies, are presented herein. For example, it is 
clear that, within 10 minutes of IFNα treatment, there is more ISGF-3 in the 
nucleus than in the cytoplasm and that the complete factor has a much higher 
affinity for the ISRE than the 48-kD ISGF-3γ component by itself (Kessler et al., 
1990). 

In summary, the attachment of interferon-α (IFN-α) to its specific cell surface 
receptor activates the transcription of a limited set of genes, termed ISGs for 
"interferon stimulated genes" [Larner et al., PROC. NATL. ACAD. SCI. USA, 81 
(1984); Larner et al., J. BIOL. CHEM., 261 (1986); Friedman et al., CELL, 38 
(1984)]. The observation that agents that affect second messenger levels do not 
activate transcription of these genes led to the proposal that protein:protein 
interactions in the cytoplasm beginning at the IFN receptor might act directly in 
transmitting to the nucleus the signal generated by receptor occupation [Levy et 
al., NEW BIOLOGIST, 2 (1991)]. 

To test this hypothesis, the present applicants began experiments in the nucleus at 
the activated genes. Initially, the ISRE and ISGF-3 were discovered [Levy et al., 
GENES & DEV., 2 (1988)]. 

Partial purification of ISGF-3 followed by recovery of the purified proteins from a 
specific DNA-protein complex revealed that the complete complex was made up of 
four proteins [Fu et al., PROC. NATL. ACAD. SCI. USA, 87 (1990); Kessler et 
al., GENES & DEV., 4 (1990)]. A 48 kD protein termed ISGF-3γ, because 
pre-treatment of HeLa cells with IFN-γ increased its presence, binds DNA weakly 
on its own [Ibid.; and Levy et al., THE EMBO J., 9 (1990)]. In combination 
with the IFN-α activated proteins, termed collectively the ISGF-3α proteins, the 
ISGF-3γ forms a complex that binds the ISRE with a 50-fold higher affinity 
[Kessler et al., GENES & DEV., 4 (1990)]. The ISGF-3α proteins comprise a set 
of polypeptides of 113, 91 and 84 kD. All of the ISGF-3 components initially 
reside in the cell cytoplasm [Levy et al., GENES & DEV., 3 (1989); Dale et al., 
PROC. NATL. ACAD. SCI. USA, 86 (1989)]. However, after only about five 
minutes of IFN-α treatment the active complex is found in the cell nucleus, thus 
confirming these proteins as a possible specific link from an occupied receptor to a 
limited set of genes [Levy et al., GENES & DEV., 3 (1989)]. 

In accordance with the present invention, specific proteins comprising receptor 
recognition factors have been isolated and sequenced. These proteins, their 
fragments, antibodies and other constructs and uses thereof, are contemplated and 
presented herein. To understand the mechanism of cytoplasmic activation of the 
ISGF-3α proteins as well as their transport to the nucleus and interaction with 
ISGF-3γ, this factor has been purified in sufficient quantity to obtain peptide 
sequence from each protein. Degenerate deoxyoligonucleotides that would encode 
the peptides were constructed and used in a combination of cDNA library 
screening and PCR amplification of cDNA products copied from mRNA to 
identify cDNA clones encoding each of the four proteins. What follows in the 
examples presented herein is a description of the final protein preparations that 
allowed the cloning of cDNAs encoding all the proteins, and the primary sequence 
of the 113 kD protein arising from a first gene, and the primary sequences of the 
91 and 84 kD proteins which appear to arise from two differently processed RNA 
products from another gene. Antisera against portions of the 84 and 91 kD 
proteins have also been prepared and bind specifically to the ISGF-3 DNA binding 
factor (detected by the electrophoretic mobility shift assay with cell extracts), 
indicating that these cloned proteins are indeed part of ISGF-3. The availability of 
the cDNA and the proteins they encode provides the necessary material to 
understand how the liganded IFN-α receptor causes immediate cytoplasmic 
activation of the ISGF-3 protein complex, as well as to understand the mechanisms 
of action of the receptor recognition factors contemplated herein. The cloning of 
each of the ISGF-3α proteins, and the evaluation and confirmation of the particular 
role played by the 91 kD protein as a messenger and DNA binding protein in 
response to IFN-γ activation, including the development and testing of antibodies 
to the receptor recognition factors of the present invention, are all presented in the 
examples that follow below. 

EXAMPLE 1 

To purify relatively large amounts of ISGF-3, HeLa cell nuclear extracts were 
prepared from cells treated overnight (16-18 h) with 0.5 ng/ml of IFN-γ and 45 
min. with IFN-α (500 u/ml). The steps used in the large scale purification were 
modified slightly from those described earlier in the identification of the four 
ISGF-3 proteins. 

Accordingly, nuclear extracts were made from superinduced HeLa cells [Levy et 
al., THE EMBO J., 9 (1990)] and chromatographed as previously described [Fu 
et al., PROC. NATL. ACAD. SCI. USA, 87 (1990)] on: phosphocellulose P-11; 
heparin agarose (Sigma); DNA cellulose (Boehringer Mannheim; flow through was 
collected after the material was adjusted to 0.28 M KCl and 0.5% NP-40); two 
successive rounds of ISRE oligo affinity column (1.8 ml column, eluted with a 
linear gradient of 0.05 to 1.0 M KCl); a point mutant ISRE oligonucleotide affinity 
column (flow through was collected after the material was adjusted to 0.28 M 
KCl); and a final round on the ISRE oligonucleotide column (material was eluted 
in a linear 0.05 to 1.0 M NaCl gradient adjusted to 0.05% NP-40). Column 
fractions containing ISGF-3 were subsequently examined for purity by SDS 
PAGE/silver staining and pooled appropriately. The pooled fractions were 
concentrated by a centricon-10 (Amicon). The pools of fractions from 
preparations 1 and 2 were combined and run on a 10 cm wide, 1.5 mm thick 
7.5% SDS polyacrylamide gel. The proteins were electroblotted to nitrocellulose 
for 12 hrs at 20 volts in 12.5% MeOH, 25 mM Tris, 190 mM glycine. The 
membrane was stained with 0.1% Ponceau Red (in 1% acetic acid) and the bands 
of 113 kD, 91 kD, 84 kD, and 48 kD excised and subjected to peptide analysis 
after tryptic digestion [Wedrychowski et al., J. BIOL. CHEM., 265 (1990); 
Aebersold et al., PROC. NATL. ACAD. SCI. USA, 84 (1987)]. The resulting 
peptide sequences for the 91 kD and 84 kD proteins are indicated in Fig. 6. 
Degenerate oligonucleotides were designed based on the peptide sequences t19, 
t13b and t27 (forward and reverse complements are denoted by F and R): 

19F    AACGTIGACCAATTNAACATG    (SEQ ID NO: 14) 
       T T GC T 
       T 

13bR   GTCGATGTTNGGGTANAG    (SEQ ID NO: 15) 
       A A A A A 

27R    GTACAAITCAACCAGNGCAA    (SEQ ID NO: 16) 
       T TG T T 
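
For illustration, the number of distinct sequences represented by a degenerate 
oligonucleotide can be computed as in the following sketch. It treats "N" as any 
of the four bases and, as an assumption, treats "I" as inosine incorporated as a 
single base rather than a mixture. Only the top line of each listed sequence is 
used, because the positions of the alternative bases printed beneath it are not 
aligned in this text. The function names are hypothetical. 

```python
from itertools import product

# Degenerate code expanded in this sketch. "I" is assumed to be inosine and
# is left as a single base; all other letters stand for themselves.
DEGENERATE = {"N": "ACGT"}


def degeneracy(oligo):
    """Number of distinct sequences in the degenerate oligonucleotide pool."""
    count = 1
    for base in oligo:
        count *= len(DEGENERATE.get(base, base))
    return count


def expand(oligo):
    """List every sequence in the pool."""
    choices = [DEGENERATE.get(base, base) for base in oligo]
    return ["".join(combo) for combo in product(*choices)]


# Top line of oligonucleotide 13bR as printed above.
primer_13bR = "GTCGATGTTNGGGTANAG"

print(degeneracy(primer_13bR))
print(expand(primer_13bR)[:4])
```

Each additional mixed position multiplies the size of the pool, which is why 
peptide regions with low codon degeneracy are generally favored when designing 
such probes. 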

The final ISRE oligonucleotide affinity selection yielded material with the SDS 
polyacrylamide gel electrophoretic pattern shown in Fig. 4 (left). This gel 
represented about 1.5% of the available material purified from over 200 L of 
appropriately treated HeLa cells. While 113, 91, 84 and 48 kD bands were 
clearly prominent in the final purified preparation (see Fig. 4, right panel), there 
were also two prominent contaminants of about 118 and 70 kD and a few other 
contaminants in lower amounts. [Amino acid sequence data have shown that the 
contaminants of 86 kD and 70 kD are the KU antigen, a widely-distributed protein 
that binds DNA termini. However, in the specific ISGF-3:ISRE complex there is 
no KU antigen and therefore it has been assigned no role in IFN-dependent 
transcriptional stimulation [Wedrychowski et al., J. BIOL. CHEM., 265 (1990)].] 

Since the mobility of the 113, 91, 84, and 48 kD proteins could be accurately 
marked by comparison with the partially purified proteins characterized in 
previous experiments [Fu et al., PROC. NATL. ACAD. SCI. USA, 87 (1990)], 
further purification was not attempted at this stage. The total purified sample 
from 200 L of HeLa cells was loaded onto one gel, subjected to electrophoresis, 
transferred to nitrocellulose and stained with Ponceau red. The 113, 84, 91, and 
48 kD protein bands were separately excised and subjected to peptide analysis as 
described [Aebersold et al., PROC. NATL. ACAD. SCI. USA, 84 (1987)]. 
Released peptides were collected, separated by HPLC and analyzed for sequence 
content by automated Edman degradation analysis. 

Accordingly, the use of the peptide sequence data for three of four peptides from 
the 91 kD protein and a single peptide derived from the 84 kD protein is described 
herein. The peptide sequence and the oligonucleotides constructed from them are 
given in the legend to Fig. 4 or 6. When oligonucleotides 19F and 13bR were 
used to prime synthesis from a HeLa cell cDNA library, a PCR product of 475 bp 
was generated. When this product was cloned and sequenced it encoded the 13a 
peptide internally. Oligonucleotide 27R derived from the only available 84 kD 
peptide sequence was used in an anchored PCR procedure amplifying a 405 bp 
segment of DNA. This 405 bp amplified sequence was identical to an already 
sequenced region of the 91 kD protein. It was then realized that the peptide t27 
sequence was contained within peptide t19 and that the 91 and 84 kD proteins 
must be related (see Figs. 5 and 7). Oligonucleotides 19F and 13a were also used to 
select candidate cDNA clones from a cDNA library made from mRNA prepared 
after 16 hr. of IFN-γ and 45 min. of IFN-α treatment. 
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Of the numerous cDNA clones that hybridized these oligonucleotides and also the 
cloned PGR products, one cDNA clone, E4, contained the largest open reading 
frame flanked by inframe stop codons. Sequence of peptides tl9, tl3a, and tl3b 
were contained in this 2217 bp ORF (see Fig. 6) which was sufficient to encode a 
5 protein of 739 amino acids (calculated molecular weight of 86 kD), The codon for 
the indicated initial methionine was preceded by three in frame stop codons. This 
coding capacity has been confirmed by translating in vitro an RNA copy of the E4 
clone yielding product of nominal size of 86 kD, somewhat shorter than the in 
vitro purified 91 kD protein discussed earlier (data not shown). Perhaps this result 
10 indicates post-translational modification of the protein in the cell. 
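
By way of illustration only, the relationship between the stated ORF length and the stated protein length may be checked with a short calculation. The sketch below assumes that the 2217 bp figure excludes the stop codon; the function name is hypothetical and is introduced solely for this example. The mean residue mass it reports is merely the value implied by the stated 86 kD and 739 residues, not an independently measured quantity.

```python
def orf_to_residues(orf_bp: int) -> int:
    """Number of codons in an ORF whose length excludes the stop codon."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_bp // 3


# Figures taken from the text: a 2217 bp ORF and a calculated mass of 86 kD.
residues = orf_to_residues(2217)
implied_mean_residue_mass = 86_000 / residues  # daltons per residue

print(residues, round(implied_mean_residue_mass, 1))
```

The implied mean residue mass falls in the range commonly seen for proteins, which is consistent with the calculated 86 kD figure being derived from the 739-residue translation.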

A second class of clones was also identified (see Fig. 5). E3, the prototype of this
class, was identical to E4 from the 5' end to bp 2286 (aa 701), at which point the
sequences diverged completely. Both cDNAs terminated with a poly(A) tail.
Primer extension analysis suggested another ~150 bp were missing from the 5'
end of both mRNAs. DNA probes were made from the clones representing both
common and unique sequences for use in Northern blot analyses. The preparation
of the probes is as follows: 20 mg of cytoplasmic RNA (0.5% NP-40 lysate) from
IFN-α treated (6 h) HeLa cells was fractionated in a 1% agarose, 6%
formaldehyde gel (in 20 mM MOPS, 5 mM NaAc, 1 mM EDTA, pH 7.0) for 4.5
h at 125 volts. The RNA was transferred in 20 x SSC to Hybond-N (Amersham),
UV crosslinked and hybridized with 1 x 10^6 cpm/ml of the indicated probes
(1.5x10* cpm/mg).

Probes from regions common to E3 and E4 hybridized to two RNA species of
approximately 3.1 KB and 4.4 KB. Several probes derived from the 3'
non-coding end of E4, which were unique to E4, hybridized only to the larger RNA
species. A labeled DNA probe from the unique 3' non-coding end of E3
hybridized only to the smaller RNA species.

Review of the sequence at the site of 3' discontinuity between E3 and E4
suggested that the shorter mRNA results from choice of a different poly(A) site
and 3' exon that begins at bp 2286. The last two nucleotides before the change
are GT followed by GT in E3, in line with the consensus nucleotides at an
exon-intron junction. Since the ORF of E4 extends to bp 2401 it encodes a protein
that is 38 amino acids longer than the one encoded by E3, but is otherwise
identical (the calculated molecular weight from the E3 ORF is 82 kD).
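
The 38 amino acid difference stated above can be cross-checked in two ways from figures given in the text. The following is a minimal sketch; the constant names are hypothetical and the calculation assumes that the E4 ORF continues in frame from the divergence point to bp 2401.

```python
# Figures stated in the text.
E4_ORF_END_BP = 2401      # last base pair of the E4 ORF
DIVERGENCE_BP = 2286      # point at which E3 and E4 diverge
E4_RESIDUES = 739         # length of the E4-encoded protein
SHARED_RESIDUES = 701     # residue number at the divergence point

# Whole codons lying between the divergence point and the end of the E4 ORF.
extra_from_bp = (E4_ORF_END_BP - DIVERGENCE_BP) // 3

# Residues of the E4 protein lying beyond the shared region.
extra_from_residues = E4_RESIDUES - SHARED_RESIDUES

print(extra_from_bp, extra_from_residues)
```

Both routes give the same count, which agrees with the 38 residue figure reported for the E4-specific carboxy terminal extension.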

Since there is no direct assay for the activity of the 91 or 84 kD protein, an
independent method was needed to determine whether the cDNA clones we had
isolated did indeed encode proteins that are part of ISGF-3. For this purpose
antibodies were initially raised against the sequence from amino acid 597 to amino
acid 703 (see Fig. 6) by expressing this peptide in the pGEX-3X vector (15) as a
bacterial fusion protein. This antiserum (a42) specifically recognized the 91 kD
and 84 kD proteins in both crude extracts and purified ISGF-3 (see Fig. 7a).
More importantly, this antiserum specifically affected the ISGF-3 band in a
mobility shift assay using the labeled ISRE oligonucleotide (see Fig. 7b),
confirming that the isolated 91 kD and 84 kD cDNA clones (E4 and E3) represent
a component of ISGF-3. Additional antisera were raised against the amino
terminus and carboxy terminus of the protein encoded by E4. The amino terminal
59 amino acids that are common to both proteins and the unique carboxy terminal
34 amino acids encoded only by the larger mRNA were expressed as fusion
proteins in pGEX-3X for immunization of rabbits. Western blot analysis with
highly purified ISGF-3 demonstrated that the amino terminal antibody (a55)
recognized both the 91 kD and 84 kD proteins as expected. However, the other
antibody (a57) recognized only the 91 kD protein, confirming our assumption that
the larger mRNA (4.4 KB) and larger cDNA encode the 91 kD protein while the
shorter mRNA (3.1 KB) and cDNA encode the 84 kD protein (see Fig. 7a).

EXAMPLE 2

In this example, the cloning of the 113 kD protein that comprises one of the three
ISGF-3α components is disclosed.

From SDS gels of highly purified ISGF-3, the 113 kD band was identified,
excised and subjected to cleavage and peptide sequence analysis [Aebersold et al.,
PROC. NATL. ACAD. SCI. USA, 84 (1987)]. Five peptide sequences (A-E) were
obtained (Fig. 8A). Degenerate oligonucleotide probes were designed according to
these peptides and were then radiolabeled to search a human cDNA library for
clones that might encode the 113 kD protein. Eighteen positive cDNA clones
were recovered from 2.5 x 10* phage plaques with the probe derived from peptide
E (Fig. 8A, and the legend). Two of them were completely sequenced. Clone f11
contained a 3.2 KB cDNA, and clone ka31 a 2.6 KB cDNA that overlapped about
2 KB but which had a further extended 5' end in which a candidate AUG initiation
codon was found associated with a well-conserved Kozak sequence [Kozak,
NUCLEIC ACIDS RES., 12].
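
The design of degenerate oligonucleotide probes from peptide sequence turns on how many distinct DNA sequences could encode a given stretch of residues. The following is a minimal sketch of that count under the standard genetic code; it is not the procedure actually used to design the probes for peptides A-E, and the example peptide and function names are hypothetical.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code.
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def probe_degeneracy(peptide: str) -> int:
    """Count the distinct DNA sequences that could encode `peptide`."""
    return prod(CODONS_PER_RESIDUE[aa] for aa in peptide.upper())


def least_degenerate_window(peptide: str, length: int) -> tuple[str, int]:
    """Return the window of `length` residues with the lowest degeneracy."""
    windows = [peptide[i:i + length] for i in range(len(peptide) - length + 1)]
    best = min(windows, key=probe_degeneracy)
    return best, probe_degeneracy(best)


# Hypothetical peptide, not one of peptides A-E from the source.
example = "LSMWKDEFR"
print(least_degenerate_window(example, 6))
```

Windows rich in methionine and tryptophan, each encoded by a single codon, give the least degenerate probes, whereas leucine, serine and arginine each multiply the pool sixfold.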



In addition to the phage cDNA clones, a PCR product made between
oligonucleotides that encoded peptides D and E also yielded a 474 NT fragment
that when sequenced was identical with the cDNA clone in this region. A
combination of these clones f11 and ka31 revealed an open reading frame capable
of encoding a polypeptide of 851 amino acids (Fig. 8A). These two clones were
joined within their overlapping region, and RNA transcribed from this recombinant
clone was translated in vitro, yielding a polypeptide that migrated in an SDS gel
with a nominal molecular weight of 105 kD (Fig. 9A). An appropriate clone
encoding the 91 kD protein was also transcribed and the RNA translated in the
same experiment. Since both the apparently complete cDNA clones for the 113
kD protein and the 91 kD protein produce RNAs that when translated into proteins
migrate somewhat faster than the proteins purified as ISGF-3 components, it is
possible that the proteins undergo post-translational modification in the cell causing
them to be slightly retarded during electrophoresis. When a 660 bp cDNA
encoding the most 3' end of the 113 kD protein was used in a Northern analysis, a
single 4.8 KB mRNA species was observed (Figure 9B).

No independent assay is known for the activity of the 113 kD protein (or indeed any of
the ISGF-3α proteins), but it is known that the protein is part of a DNA binding
complex that can be detected by an electrophoretic mobility shift assay [Fu et al.,
PROC. NATL. ACAD. SCI. USA, 87 (1990)]. Antibodies to DNA binding
proteins are known to affect the formation or migration of such complexes.
Therefore antiserum to a polypeptide segment (amino acid residues 323 to 527)
fused with bacterially expressed glutathione S-transferase [Smith et al., PROC. NATL. ACAD.
SCI. USA, 83 (1986)] was raised in rabbits to determine the reactivity of the
ISGF-3 proteins with the antibody. A Western blot analysis showed that the
antiserum reacted predominantly with a 113 kD protein both in the ISGF-3 fraction
purified by specific DNA affinity chromatography (Lane 1) and in crude cell
extract (Lane 2, Fig. 10A). The weak reactivity to lower protein bands was
possibly due to 113 kD protein degradation. Most importantly, the antiserum
specifically removed almost all of the gel-shift complex, leaving some of the
oligonucleotide probe in "shifted-shift" complexes which were specifically
competed away with a 50 fold molar excess of the oligonucleotide binding site (the
ISRE, ref. 2) for ISGF-3 (Fig. 10B). Notably, this antiserum had no effect on the
faster migrating shift band produced by the ISGF-3γ component alone (Figure 10B).
Thus it appeared that the antiserum to the 113 kD fusion product does indeed react
with another protein that is part of the complete ISGF-3 complex.

A detailed sequence comparison between the 113 and 91 sequences followed (Fig.
8B): while the nucleotide sequences showed only a distant relationship between the
two proteins, there were long stretches of amino acid identity. These conserved
regions were scattered throughout almost the entire 715 amino acid length encoded
by the 91/84 clone. It was particularly striking that the regions corresponding to
amino acids 1 to 48, 317 to 353 and 654 to 678 in the 113 sequence were 60%
to 70% identical to corresponding regions of the 91 kD sequence. Thus the genes
encoding the 113 and 84/91 proteins are closely related but not identical.
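
The percent identity figures quoted above are of the kind obtained by counting matching residues over an aligned region. A minimal sketch of such a count follows; it assumes the two fragments have already been aligned, with "-" marking gaps, and the example fragments are hypothetical rather than taken from Fig. 8B.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned, ungapped columns in which the two residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore any column in which either sequence carries a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != "-" and b != "-"]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


# Hypothetical aligned fragments, not taken from Fig. 8B.
print(percent_identity("MSQWYELQQL", "MAQWEMLQNL"))
```

Whether gapped columns are excluded from the denominator, as here, or counted as mismatches is a design choice that changes the reported figure; the source does not state which convention was used.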

Through examination for possible consensus sequences that might reveal
sub-domain structures in the 113 kD or 84/91 kD sequence, it was found that both
proteins contained regions whose sequence might form a coil structure with heptad
leucine repeats. This occurred between amino acids 210 and 245 in the 113 kD
protein and between 209 and 237 in the 84/91 protein. In both the 113 kD and the
91/84 kD sequences, 4 out of 5 possible heptad repeats were leucine and one was
valine. Domains of this type might provide a protein surface that encourages
homo- or heterotypic protein interactions, which have been observed in several other
transcription factors [Vinson et al., SCIENCE, 246 (1989)]. An extended acidic
domain was located at the carboxyl terminus of the 113 kD protein but not in the 91
kD protein (Fig. 8A), possibly implicating the 113 kD protein in gene activation
[Hope et al., Ma et al., CELL, 48 (1987)].
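
A search for heptad repeats of the kind described above can be sketched as a scan for leucine or valine recurring at every seventh position. The code below is an illustrative assumption about how such a scan might be written, not the method used in the original work; the function name, the default parameters and the toy sequence are hypothetical.

```python
def heptad_runs(seq: str, residues: str = "LV", min_repeats: int = 4):
    """Find runs of positions spaced seven apart that all hold one of `residues`.

    Returns (start_index, repeat_count) pairs using 0-based indexing; each run
    is reported once, from its first position.
    """
    hits = []
    for start, aa in enumerate(seq):
        if aa not in residues:
            continue
        # Skip positions that merely continue a run already examined.
        if start >= 7 and seq[start - 7] in residues:
            continue
        count = 0
        pos = start
        while pos < len(seq) and seq[pos] in residues:
            count += 1
            pos += 7
        if count >= min_repeats:
            hits.append((start, count))
    return hits


# Hypothetical sequence: L or V at every seventh residue, five times over.
toy = "LAAAAAA" "LAAAAAA" "VAAAAAA" "LAAAAAA" "LAAAAAA"
print(heptad_runs(toy))
```

Valine is accepted alongside leucine by default because the text reports that one of the five heptad positions in each protein is valine.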

DISCUSSION

When compared at moderate or high stringency to the Genbank and EMBL data
bases, there were no sequences like 113 or the 84/91 sequence. Preliminary PCR
experiments however indicate that there are other family members with different
sequences recoverable from a human cell cDNA library (Qureshi and Darnell,
unpublished). Thus, it appears that the 113 and 84/91 sequences may represent
the first two members to be cloned of a larger family of proteins. We would
hypothesize that the 113 kD and 84/91 kD proteins may act as signal transducers,
somehow interacting with the internal domain of a liganded IFNα receptor or its
associated protein, and further that a family of waiting cytoplasmic proteins exists
whose purpose is to be specific signal transducers when different receptors are
occupied. Many experiments lie ahead before this general hypothesis can be
crucially tested. Recent experiments have indicated that inhibitors of protein
kinases can prevent ISGF-3 complex formation [Reich et al., PROC. NATL.
ACAD. SCI. USA, 87 (1990); Kessler et al., J. BIOL. CHEM., 266 (1991)].
However, neither the IFNα nor the IFNγ receptors that have so far been cloned have
intrinsic kinase activity [Uze et al., CELL, 60 (1990); Aguet et al., CELL, 55
(1988)]. We would speculate that either a second receptor chain with kinase
activity or a separate kinase bound to a liganded receptor could be a part of a
complex that would convey signals to the ISGF-3α proteins at the inner surface of
the plasma membrane.



From the above, it has been concluded that accurate peptide sequences from
ISGF-3 protein components have been determined, leading to correct identification
of cDNA clones encoding the 113, 91 and 84 kD components of ISGF-3. Since
staurosporine, a broadly effective kinase inhibitor, blocks IFN-α induction of
transcription and of ISGF-3 formation [Reich et al., PROC. NATL. ACAD. SCI.
USA, 87 (1990); Kessler et al., J. BIOL. CHEM., 266 (1991)], it seems possible
that the ISGF-3α proteins are direct cytoplasmic substrates of a liganded
receptor-associated kinase. The antiserum against these proteins should prove
invaluable in identifying the state of the ISGF-3α proteins before and after IFN
treatment and will allow the direct exploration of the biochemistry of signal
transduction from the IFN receptor.

As mentioned earlier, the observations and conclusions underlying the present
invention were crystallized from a consideration of the results of certain
investigations with particular stimuli. Particularly, the present disclosure is
illustrated by the results of work on protein factors that govern transcriptional
control of IFNα-stimulated genes, as well as more recent data on the regulation of
transcription of genes stimulated by IFNγ.

For example, there is evidence that the 91 kD protein is the tyrosine kinase target
when IFNγ is the ligand. Thus two different ligands acting through two different
receptors both use these family members. With only a modest number of family
members and combinatorial use in response to different ligands, this family of
proteins becomes an even more likely candidate to represent a general link
between ligand-occupied receptors and transcriptional control of specific genes in
the nucleus.

Further study of the 113, 91 and 84 kD proteins of the present invention has
revealed that they are phosphorylated in response to treatment of cells with IFNα
(Figure 11). Moreover, when the phosphoamino acid is determined in the newly
phosphorylated protein, the amino acid has been found to be tyrosine (Fig. 12).
This phosphorylation has been observed to disappear after several hours, indicating
action of a phosphatase on the 113, 91 and 84 kD proteins to stop transcription.
These results show that IFN-dependent transcription very likely demands this
particular phosphorylation and that a cycle of interferon-dependent
phosphorylation-dephosphorylation is responsible for controlling transcription.

It is proposed that other members of the 113-91 protein family will be identified as
phosphorylation targets in response to other ligands. If, as is believed, the tyrosine
phosphorylation site on proteins in this family is conserved, one can then easily
determine which family members are activated (phosphorylated), and likewise the
particular extracellular polypeptide ligand to which that family member is
responding. The modifications of these proteins (phosphorylation and
dephosphorylation) enable the preparation and use of assays for determining the
effectiveness of pharmaceuticals in potentiating or preventing intracellular
responses to various polypeptides, and such assays are accordingly contemplated
within the scope of the present invention.

EXAMPLE 4

Identification of murine 91 kD protein

A fragment of the gene encoding the human 91 kD protein was used to screen a
murine thymus and spleen cDNA library for homologous proteins. The screening
assay yielded a highly homologous gene encoding a murine polypeptide that is
greater than 95% homologous to the human 91 kD protein. The nucleic acid and
deduced amino acid sequences of the murine 91 kD protein are shown in Figure
1^^12G ; and SEQ ID NO: 7 (nucleotide sequence) and SEQ ID NO: 8 (amino acid
sequence).



EXAMPLE 5
Additional Members of The 113-91 Protein Family

Using a 300 nucleotide fragment amplified by PCR from the SH2 region of the
murine 91 kD protein gene, murine genes encoding two additional members of the
113-91 family of receptor recognition factor proteins were isolated from a murine
splenic/thymic cDNA library constructed in the ZAP vector, according to the
method of Sambrook et al. (1989, Molecular Cloning, A Laboratory Manual, 2nd
ed., Cold Spring Harbor Press: Cold Spring Harbor, New York). Hybridization
was carried out at 42°C and the filters were washed at 42°C before the first
exposure (Church and Gilbert, 1984, Proc. Natl. Acad. Sci. USA 81:1991-95). Then
the filters were washed in 2X SSC, 0.1% SDS at 65°C for a second exposure. Stat1
clones survived the 65°C washing, whereas Stat3 and Stat4 clones were identified as
plaques that lost signals at 65°C. The plaques were purified and subcloned
according to Stratagene commercial protocols.

This probe was chosen to screen for other STAT family members because, while
Stat1 and Stat2 SH2 domains are quite similar over the entire 100 to 120 amino
acid region, only the amino terminal half of the STAT SH2 domains strongly
resembles the SH2 regions found in other proteins.

The two genes have been cloned into plasmids 13sf1 and 19sf6. The nucleotide
sequences, and deduced amino acid sequences, for the 13sf1 and 19sf6 genes are
shown in Figures 14 and 15, respectively. These proteins are alternatively termed
Stat4 and Stat3, respectively.

Comparison with the sequences of Stat91 (Stat1) and Stat113 (Stat2) shows several
highly conserved regions, including the putative SH3 and SH2 domains. The
conserved amino acid stretches likely point to conserved domains that enable these
proteins to carry out transcription activation functions. Stat3, like Stat1 (Stat91),
is widely expressed, while Stat4 expression is limited to the testes, thymus, and
spleen. Stat3 has been found to be activated as a DNA binding protein through
phosphorylation on tyrosine in cells treated with EGF or IL-6, but not after IFN-γ
treatment.

Both the 13sf1 and 19sf6 genes share significant homology with the genes
encoding the human and murine 91 kD proteins. There is corresponding homology
between the deduced amino acid sequences of the 13sf1 and 19sf6 proteins and the
amino acid sequences of the human and murine 91 kD proteins, although not the
greater than 95% amino acid homology that is found between the murine and
human 91 kD proteins. Thus, though clearly of the same family as the 91 kD
protein, the 13sf1 and 19sf6 genes encode distinct proteins.

The chromosomal locations of the genes encoding the murine STAT proteins (1-4)
have been determined: Stat1 and Stat4 are located in the centromeric region of mouse
chromosome 1 (corresponding to human 2q 32-34q); the two other genes are on
other chromosomes.

Southern analysis using probes derived from 13sf1 and 19sf6 on human genomic
libraries has established that genes corresponding to the murine 13sf1 and 19sf6
genes are found in humans.

Tissue distribution of mRNA expression of these genes was evaluated by Northern 
hybridization analysis. The results of this distribution analysis are shown in the 
following Table. 



TABLE
DISTRIBUTION OF mRNA EXPRESSION OF 13sf1, 19sf6, 91 kD PROTEINS

ORGAN           13sf1       19sf6       91 kD
BRAIN           -           +           -
HEART           -           +++         -
KIDNEY          -           -           -
LIVER           -           +           +
LUNG            -           -           -
SPLEEN          +           +           ++++
TESTIS          ++++        ++          N.A.
THYMUS          ++          ++          +++
EMBRYO (16d)    not found   found       found



Northern analysis demonstrates that there is variation in the tissue distribution of
expression of the mRNAs encoded by these genes. The variation and tissue
distribution indicate that the specific genes encode proteins that are responsive to
different factors, as would be expected in accordance with the present invention.
The actual ligand, the binding of which induces phosphorylation of the newly
discovered factors, will be readily determinable based on the tissue distribution
evidence described above.

To determine whether the Stat3 and Stat4 proteins were present in cells, protein
blots were carried out with antisera against each protein. The antisera were
obtained by subcloning amino acids 688 to 727 of Stat3 and 678 to 743 of Stat4 into
pGEX1λT (Pharmacia) by PCR with oligonucleotides based on the boundary
sequence plus restriction sites (BamHI at the 5' end and EcoRI at the 3' end),
allowing for in-frame fusion with GST. One milligram of each antigen was used
for the immunization and three booster injections were given 4 weeks apart.
Anti-Stat3 and anti-Stat4 sera were used 1:1000 in Western blots using standard
protocols. To avoid cross reactivity of the antisera, antibodies were raised against
the C-terminus of Stat3 and Stat4, the less homologous region of the protein.

These proteins were unambiguously found in several tissues where the mRNA was
known to be present. Protein expression was checked in several cell lines as well.
A protein of 89 kD reactive with Stat4 antiserum was expressed in 70Z cells, a
preB cell line, but not in many other cell lines. Stat3 was highly expressed,
predominantly as a 97 kD protein, in 70Z, HT2 (a mouse helper T cell clone), and
U937 (a macrophage-derived cell).

To prove that the full length functional cDNA clones of Stat3 and Stat4 were
obtained, the open reading frame of each cDNA was independently (i.e.,
separately) cloned into the Rc/CMV expression vector (Invitrogen) downstream of
a CMV promoter. The resulting plasmids were transfected into COS1 cells and
proteins were extracted 60 hrs post-transfection and examined by Western blot
after electrophoresis. Untransfected COS1 cells expressed a low level of 97 kD
Stat3 protein but did not express a detectable level of Stat4. Upon transfection of
the Stat3-expressing plasmid, the 97 kD Stat3 was increased at least 10-fold. An
89 kD protein antigenically related to Stat3, found as a minor band in most cell
line extracts, was also increased post-transfection. This protein therefore appears
to represent another form of Stat3 protein, or an antigenically similar protein
whose synthesis is stimulated by Stat3. Transfection with Stat4 led to the
expression of an 89 kD reactive band indistinguishable in size from the p89 Stat4
found in 70Z cell extracts.

DISCUSSION

As mentioned earlier, the observations and conclusions underlying the present
invention were crystallized from a consideration of the results of certain
investigations with particular stimuli. Particularly, the present disclosure is
illustrated by the results of work on protein factors that govern transcriptional
control of IFNα-stimulated genes, as well as more recent data on the regulation of
transcription of genes stimulated by IFNγ. The present disclosure is further
illustrated by the identification of related genes encoding protein factors responsive
to as yet unknown factors. It is expected that the murine 91 kD protein is
responsive to IFN-γ.

For example, the above represents evidence that the 91 kD protein is the tyrosine
kinase target when IFNγ is the ligand. Thus two different ligands acting through
two different receptors both use these family members. With only a modest
number of family members and combinatorial use in response to different ligands,
this family of proteins becomes an even more likely candidate to represent a
general link between ligand-occupied receptors and transcriptional control of
specific genes in the nucleus.

It is proposed and shown by the foregoing that other members of the 113-91
protein family will be and have been identified as phosphorylation targets in
response to other ligands. If, as is believed, the tyrosine phosphorylation site on
proteins in this family is conserved, one can then easily determine which family
members are activated (phosphorylated), and likewise the particular extracellular
polypeptide ligand to which that family member is responding. The modifications
of these proteins (phosphorylation and dephosphorylation) enable the preparation
and use of assays for determining the effectiveness of pharmaceuticals in
potentiating or preventing intracellular responses to various polypeptides, and such
assays are accordingly contemplated within the scope of the present invention.

Earlier work has concluded that a DNA binding protein was activated in the cell
cytoplasm in response to IFN-γ treatment and that this protein stimulated
transcription of the GBP gene (10,14). In the present work, with the aid of
antisera to proteins originally studied in connection with IFN-α gene stimulation
(7,12,15), the 91 kD ISGF-3 protein has been assigned a prominent role in IFN-γ
gene stimulation as well. The evidence for this conclusion included: 1) Antisera
specific to the 91 kD protein affected the IFN-γ dependent gel-shift complex.
2) A 91 kD protein could be cross-linked to the GAS IFN-γ activated site. 3) A
35S-labeled 91 kD protein and a 91 kD immunoreactive protein specifically purified
with the gel-shift complex. 4) The 91 kD protein is an IFN-γ dependent tyrosine
kinase substrate, as indeed it had earlier proved to be in response to IFN-α (15).
5) The 91 kD protein but not the 113 kD protein moved to the nucleus in response
to IFN-γ treatment. None of these experiments prove, but together they strongly
suggest, that the same 91 kD protein acts differently in different DNA binding
complexes that are triggered by either IFN-α or IFN-γ.



These results strongly support the hypothesis, originating from studies on IFN-α, that
polypeptide cell surface receptors report their occupation by extracellular ligand to
latent cytoplasmic proteins that after activation move to the nucleus to trigger
transcription (4,15,21). Furthermore, because cytoplasmic phosphorylation and
factor activation are so rapid, it appears likely that the functional receptor complexes
contain tyrosine kinase activity. Since the IFN-γ receptor chain that has been
cloned thus far (22) has no hint of possessing intrinsic kinase activity, perhaps
some other molecule with tyrosine kinase activity couples with the IFN-γ receptor.
Two recent results with other receptors suggest possible parallels to the situation
with the IFN receptors. The trk protein, which has an intracellular tyrosine kinase
domain, associates with the NGF receptor when that receptor is occupied (23). In
addition, the lck protein, a member of the src family of tyrosine kinases, is
co-precipitated with the T cell receptor (24). It is possible to predict that signal
transduction to the nucleus through these two receptors could involve latent
cytoplasmic substrates that form part of activated transcription factors. In any
event, it seems possible that there are kinases like trk or lck associated with the
IFN-γ receptor or with the IFN-α receptor.




With regard to the effect of phosphorylation on the 91 kD protein, it was
something of a surprise that after IFN-γ treatment the 91 kD protein becomes a
DNA binding protein. Its role must be different in response to IFN-α treatment.
There the 91 kD protein is also phosphorylated on tyrosine and joins a complex
with the 113 and 84 kD proteins, but as judged by UV cross-linking studies (7),
the 91 kD protein does not contact DNA.

In addition to becoming a DNA binding protein, it is clear that the 91 kD protein is
specifically translocated to the nucleus in the wake of IFN-γ stimulation.

EXAMPLE^DIMERIZATION OF PHOSPHORYLATED STAT91 

Stat91 (a 91 kD protein that acts as a signal transducer and activator of
transcription) is inactive in the cytoplasm of untreated cells but is activated by
phosphorylation on tyrosine in response to a number of polypeptide ligands
including IFN-α and IFN-γ. This example reports that inactive Stat91 in the
cytoplasm of untreated cells is a monomer and that upon IFN-γ induced
phosphorylation it forms a stable homodimer. The dimer is capable of binding to
a specific DNA sequence directing transcription. Dissociation and reassociation
assays show that dimerization of Stat91 is mediated through SH2-phosphotyrosyl
peptide interactions. Dimerization involving SH2 recognition of specific
phosphotyrosyl peptides may well provide a prototype for interactions among
family members of STAT proteins to form different transcription complexes and
Jak2 for the IFN-γ pathway (42, 43, 44). These kinases themselves become
tyrosine phosphorylated to carry out specific signaling events.

Materials and Methods

Cell Culture. Human 2fTGH and U3A cells were maintained in DMEM medium
supplemented with 10% bovine calf serum. U3A cell lines supplemented with various
Stat91 protein constructs were maintained in 0.1 mg/ml G418 (Gibco, BRL).
Stable cell lines were selected as described (45). IFN-γ (5 ng/ml, gift from
Amgen) treatment of cells was for 15 min. unless otherwise noted.

Plasmid Constructions. Expression construct MNC-84 was made by insertion of
the cDNA into the Not I-Bam HI cloning site of an expression vector pMNC (45,
35). MNC-91L was made by insertion of the Stat91 cDNA into the Not I-Bam
HI cloning sites of pMNC without the stop codon at the end, resulting in the
production of a long form of Stat91 with a C-terminal tag of 34 amino acids
encoded by the pMNC vector.

GST fusion protein expression plasmids were constructed using the pGEX-2T
vector (Pharmacia). GST-91SH2 encodes amino acids 573 to 672 of Stat91;
GST-91mSH2 encodes amino acids 573 to 672 of Stat91 with an Arg-602 -> Leu-602
mutation; and GST-91SH3 encodes amino acids 506 to 564 of Stat91.

DNA Transfection. DNA transfection was carried out by the calcium phosphate
method, and stable cell lines were selected in Dulbecco's modified Eagle's
medium containing G418 (0.5 mg/ml, Gibco), as described (45).

Preparation of Cell Extracts. Crude whole cell extracts were prepared as
described (31). Cytoplasmic and nuclear extracts were prepared essentially as
described (46).

Affinity Purification. Affinity purification with a biotinylated oligonucleotide was
carried out as described (31). The sequence of the biotinylated GAS oligonucleotide
was from the Ly6E gene promoter (34).

Nondenaturing Polyacrylamide Gel Analysis. A nondenatured protein molecular
weight marker kit with a range of molecular weights from 14 to 545 kD was
obtained from Sigma. Determination of molecular weights using nondenaturing
polyacrylamide gels was carried out following the manufacturer's procedure, which
is a modification of the methods of Bryan and Davis (47, 48). Phosphorylated and
unphosphorylated Stat91 samples obtained from affinity purification using a
biotinylated GAS oligonucleotide (31) were resuspended in a buffer containing 10
mM Tris (pH 6.7), 16% glycerol, 0.04% bromphenol blue (BPB). The mixtures
were analyzed on 4.5%, 5.5%, 6.5%, and 7.5% native gels side by side with
standard markers using a Bio-Rad Mini-Protean II Cell electrophoresis system.
Electrophoresis was stopped when the dye (BPB) reached the bottom of the gels.
The molecular size markers were revealed by Coomassie blue staining.
Phosphorylated and unphosphorylated Stat91 samples were detected by
immunoblotting with anti-91T.
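
Running the same samples at several gel concentrations permits a Ferguson-plot style estimate of native molecular weight. The sketch below is offered only as an assumption about the general form of such a calculation; it is not the manufacturer's procedure, and all mobility values in it are synthetic, generated from a hypothetical model. The steps are: fit log10 of relative mobility against gel concentration for each protein to obtain a retardation coefficient, then fit retardation coefficient against mass for the standards on log-log axes and interpolate the unknown.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def retardation_coefficient(gel_percents, rel_mobilities):
    """Negative slope of log10(relative mobility) against gel concentration."""
    logs = [math.log10(rf) for rf in rel_mobilities]
    slope, _ = fit_line(gel_percents, logs)
    return -slope


def estimate_mass(standards, unknown_kr):
    """Interpolate mass (kD) from a log-log fit of the standards.

    `standards` maps molecular mass in kD to retardation coefficient.
    """
    slope, intercept = fit_line(
        [math.log10(kr) for kr in standards.values()],
        [math.log10(mass) for mass in standards],
    )
    return 10 ** (slope * math.log10(unknown_kr) + intercept)


GEL_PERCENTS = [4.5, 5.5, 6.5, 7.5]  # gel concentrations named in the text


def synthetic_mobilities(mass_kd):
    # Hypothetical model used only to generate illustrative data.
    kr = 0.002 * mass_kd ** 0.6
    return [10 ** (-kr * t) for t in GEL_PERCENTS]


# Synthetic standards spanning the 14 to 545 kD range of the marker kit.
standards = {
    mass: retardation_coefficient(GEL_PERCENTS, synthetic_mobilities(mass))
    for mass in (14, 66, 132, 272, 545)
}
unknown_kr = retardation_coefficient(GEL_PERCENTS, synthetic_mobilities(100))
print(round(estimate_mass(standards, unknown_kr)))
```

Because the retardation coefficient depends on size while the extrapolated mobility at zero gel concentration depends on charge, an analysis of this kind separates the two effects, which a single native gel cannot do.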

Glycerol Gradient Analysis. Cell extracts (Bud 8) were mixed with protein
standards (Pharmacia) and subjected to centrifugation through preformed 10%-40%
glycerol gradients for 40 hours at 40,000 rpm in an SW41 rotor as described
(6).

Gel Mobility Shift Assays. Gel mobility shift assays were carried out as described
(34). An oligonucleotide corresponding to the GAS element from the human
FcγRI receptor gene (Pearse et al., 1993) was synthesized and used for gel
mobility shift assays. The oligonucleotide has the following sequence:

5'GATCGAGATGTATTTCCCAGAAAAG3' (SEQ. ID NO: 17).

Synthesis of Peptides. Peptides were prepared by solid phase synthesis, either on a
DuPont RAMPS multiple synthesizer or manually. C-terminal amino acids
attached to Wang resin were obtained from DuPont/NEN. All amino acids were
coupled as the N-Fmoc pentafluorophenyl esters (Advanced Chemtech), except for
N-Fmoc, PO-dimethyl-L-phosphotyrosine (Bachem). Double couplings were used.
Cleavage from resin and deprotection used thioanisole/m-cresol/TFA/TMSBr at
4°C for 16 hr. Purification used C-18 column HPLC with 0.1% TFA/acetonitrile
gradients. Peptides were characterized by ¹H and ³¹P NMR and by mass spectrometry,
and were greater than 95% pure.

Guanidium Hydrochloride Treatment. Extracts were incubated with guanidium
hydrochloride (final concentration 0.4 to 0.6 M) for two min. at room
temperature, then diluted with gel shift buffer (final concentration of guanidium
hydrochloride 100 mM) and incubated at room temperature for 15 min. ³²P-labeled
GAS oligonucleotide probe was then added directly to the mixture, followed
by gel mobility shift assay.
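
The dilution step in this treatment is simple arithmetic, worked through in the sketch below. The extract volume is a hypothetical value chosen only for illustration, since the text gives concentrations but no volumes.

```python
def buffer_volume_to_add(sample_volume_ul, start_molar, target_molar):
    """Volume of diluent needed to bring a solute from start_molar to target_molar."""
    if not 0 < target_molar <= start_molar:
        raise ValueError("target must be positive and no greater than the start")
    fold_dilution = start_molar / target_molar
    return sample_volume_ul * (fold_dilution - 1.0)


SAMPLE_UL = 2.0      # hypothetical extract volume, not stated in the text
TARGET_MOLAR = 0.1   # 100 mM final concentration after dilution

# The text gives a 0.4 to 0.6 M range for the denaturation step.
for start_molar in (0.4, 0.5, 0.6):
    added = buffer_volume_to_add(SAMPLE_UL, start_molar, TARGET_MOLAR)
    print(f"{start_molar} M -> {TARGET_MOLAR} M: add {added:.1f} ul of gel shift buffer")
```

Going from 0.4 to 0.6 M down to 100 mM corresponds to a four- to six-fold dilution.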

Dissociation-reassociation Analysis. Extracts were incubated with various
concentrations of peptides or fusion proteins, and ³²P-labeled GAS oligonucleotide
probe in gel shift buffer was then added to promote the formation of protein-DNA
complexes, followed by mobility shift analysis. This assay did not involve
guanidium hydrochloride treatment.

Preparation of Fusion Proteins. Bacterially expressed GST fusion proteins were
purified using standard techniques, as described in Birge et al., 1992. Fusion
proteins were quantified by absorbance at 280 nm. Aliquots were frozen
at -70°C.

Results



Detection of Ligand-Induced Dimer Formation of Stat91 in Solution. In untreated
cells, Stat91 is not phosphorylated on tyrosine. Treatment with IFN-γ leads
within minutes to tyrosine phosphorylation and activation of DNA binding
capacity. The phosphorylated form migrates more slowly during electrophoresis
under denaturing conditions, affording a simple assay for the phosphoprotein (31).



To determine the native molecular weights of the phosphorylated and
unphosphorylated forms of Stat91, we separated them by affinity purification using
a biotinylated deoxyoligonucleotide containing a GAS sequence (interferon gamma
activation site) (Figure 16A). The separation of phosphorylated Stat91 from the
unphosphorylated form was efficient: almost all detectable phosphorylated protein
bound to the GAS site, while unphosphorylated Stat91 remained unbound. To
determine the molecular weights of the purified phosphorylated and
unphosphorylated Stat91, samples of each were then subjected to electrophoresis
through a set of nondenaturing gels containing various concentrations of
acrylamide, followed by Western blot analysis (Figure 16B). Native protein size
markers (Sigma) were included in the analysis.

This technique was originally described by Bryan (48) and was recently used for
dimer analysis (49). The logic of the technique is that increasing gel
concentrations affect the migration of larger proteins more than smaller proteins,
and the analysis is not affected by modifications such as protein phosphorylation
(49).

A function of the relative mobilities (Rm) was plotted versus the concentration of
acrylamide for each sample to construct Ferguson plots (Figure 16C). The
logarithm of the retardation coefficient (calculated from Figure 16C) of each
sample was then plotted against the logarithm of the relevant molecular weight
range (Figure 16D). By extrapolation of its retardation coefficient (Figure 16D),
the native molecular weight of Stat91 from untreated cells was estimated to be
approximately 95 kD, while tyrosine phosphorylated Stat91 was estimated to be
about twice as large, or approximately 180 kD. Because the molecular weight of
Stat91 calculated from its amino acid sequence is 87 kD, and Stat91 migrates on
denaturing SDS gels with an apparent molecular weight of 91 kD (see supra, and
refs. 12 and 45), we concluded that in solution, unphosphorylated Stat91 exists as
a monomer while tyrosine phosphorylated Stat91 is a dimer.
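
The Ferguson analysis described above can be sketched numerically as follows. This is a minimal illustration, not the procedure actually used: it assumes log10(Rm) as the plotted function of relative mobility, takes the retardation coefficient as the negative slope of that line, and runs on synthetic mobilities generated from a made-up power law, since the measured values behind Figure 16 are not given in the text. The marker sizes are illustrative values within the kit's 14 to 545 kD range.

```python
import math

GEL_PERCENTS = [4.5, 5.5, 6.5, 7.5]  # acrylamide concentrations named in the text


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def retardation_coefficient(gel_percents, relative_mobilities):
    """Ferguson plot: log10(Rm) against gel concentration; K_R is minus the slope."""
    logs = [math.log10(rm) for rm in relative_mobilities]
    slope, _ = fit_line(gel_percents, logs)
    return -slope


def estimate_native_mw(marker_kr_by_mw, sample_kr):
    """Read a molecular weight off the log(K_R) versus log(MW) standard line."""
    xs = [math.log10(mw) for mw in marker_kr_by_mw]
    ys = [math.log10(kr) for kr in marker_kr_by_mw.values()]
    slope, intercept = fit_line(xs, ys)
    return 10 ** ((math.log10(sample_kr) - intercept) / slope)


def synthetic_mobilities(mw_kd):
    """Made-up model used only to generate demo data: K_R = 0.004 * MW**0.6."""
    kr = 0.004 * mw_kd ** 0.6
    return [10 ** (-kr * t) for t in GEL_PERCENTS]


# Illustrative marker sizes in kD (assumed, not taken from the text).
MARKER_MWS = [14, 29, 66, 132, 272, 545]
markers = {
    mw: retardation_coefficient(GEL_PERCENTS, synthetic_mobilities(mw))
    for mw in MARKER_MWS
}
sample_kr = retardation_coefficient(GEL_PERCENTS, synthetic_mobilities(180))
estimated_mw = estimate_native_mw(markers, sample_kr)
print(f"estimated native molecular weight: {estimated_mw:.1f} kD")
```

Because the demo sample is generated from the same made-up model as the markers, the estimate simply recovers the input size; with real gels the scatter of the marker points around the standard line sets the uncertainty of the estimate.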

We also employed glycerol gradient analysis to estimate the native molecular
weights of both phosphorylated and unphosphorylated Stat91 (Figure 17). Whole
cell extracts of fibroblast cells (Bud8) treated with IFN-γ were prepared and
subjected to sedimentation through a 10-40% glycerol gradient. Fractions from
the gradient were collected and analyzed by both immunoblotting and gel mobility
shift analysis (Figure 17A and 17B). As expected, two electrophoretic forms of
Stat91 could be detected by immunoblotting (Figure 17A): the slow-migrating
form (tyrosine phosphorylated) and the fast-migrating form (unphosphorylated).
The phosphorylated Stat91 sedimented more rapidly than the
unphosphorylated form. Again, using molecular weight markers, the native
molecular weight of the unphosphorylated form of Stat91 appeared to be about 90
kD while the tyrosine phosphorylated form of Stat91 was about 180 kD (Figure
17C), supporting the conclusion that unphosphorylated Stat91 exists as a
monomer in solution while the tyrosine phosphorylated form exists as a dimer.
When fractions from the glycerol gradients were analyzed by electrophoretic
mobility shift analysis (Figure 17B), the peak of the phosphorylated form of Stat91
correlated well with the DNA-binding activity of Stat91. Thus only the
phosphorylated dimeric Stat91 has the sequence-specific DNA recognition

Stat91 Binds DNA as a Dimer. Long or short versions of a DNA binding protein
can produce, respectively, a slower or a faster migrating band during gel
retardation assays. Finding intermediate gel shift bands produced by mixing two
different sized species provides evidence of dimerization of the DNA binding
proteins. Since Stat91 requires specific tyrosine phosphorylation in ligand-treated
cells for its DNA binding, we sought evidence of formation of such heterodimers,
first in transfected cells. An expression vector (MNC91L) encoding Stat91L, a
recombinant form of Stat91 containing an additional 34 amino acid carboxyl
terminal tag, was generated. [The extra amino acids were encoded by a segment of
DNA sequence from plasmid pMNC (see Materials and Methods).] A Stat84
expression vector (MNC84) was also available (45). From somatic cell genetic
experiments, mutant human cell lines (U3) are known that lack the Stat91/84
mRNA and proteins (29,30). The U3 cells were therefore separately transfected
with vectors encoding Stat84 (MNC84) or Stat91L (MNC91L) or a mixture of
both vectors. Permanent transfectants expressing Stat84 (C84), Stat91L (C91L) or
both proteins (Cmx) were isolated (Figure 18A).

Mobility shift analysis was performed with extracts from these stable cell lines
(Figure 18B). Extracts of IFN-γ-treated C84 cells produced a faster migrating gel
shift band than extracts of treated C91L cells. Most importantly, extracts from
IFN-γ-treated Cmx cells expressing both Stat84 and Stat91L proteins formed an
additional intermediate gel shift band. Anti-91T, an antiserum against the C-terminal
38 amino acids of Stat91 (12) that are absent in Stat84, specifically
removed the top two shift bands seen with the Cmx extracts. Anti-91, an
antiserum against amino acids 609 to 716 (15) that recognizes both Stat91L and
Stat84 proteins, inhibited the binding of all three shift bands. Thus, the middle
band formed by extracts of the Cmx cells is clearly identified as a heterodimer of
Stat84 and Stat91L. We concluded that both Stat91 and Stat84 bind DNA as
homodimers and, if present in the same cell, will form heterodimers.

We next wanted to detect the formation of dimers in vitro. When cytoplasmic or
nuclear extracts of IFN-γ-treated C84 and C91L cells were mixed and analyzed
(Figure 19), only the fast or slow migrating gel shift bands were observed. Thus
it appeared that once formed in vivo, the dimers were stable. To promote
interchange between the subunits of the dimers, a mixture of
either cytoplasmic or nuclear extracts of IFN-γ-treated C84 and C91L cells was
subjected to a mild denaturation-renaturation treatment: extracts were made 0.5 M
with respect to guanidium hydrochloride for two minutes and then diluted for
renaturation and subsequently used for gel retardation analysis. The formation of
heterodimer was clearly detected after this treatment. When extracts from either
C84 cells alone or C91L cells alone were subjected to the same treatment, the
intermediate band did not form. The intermediate band was again shown by
antiserum treatment to consist of Stat84/Stat91L dimer (data not shown).

This experiment defined conditions under which the dimer was stable, but also
showed that dissociation and reassociation of the dimer in vitro was possible.
Since guanidium hydrochloride is known to disrupt only non-covalent chemical
bonds, it seemed that Stat91 (or Stat84) homodimerization was mediated through
non-covalent interactions.

Dimerization of Stat91 Involves Phosphotyrosyl Peptide and SH2 Interactions.
Based on the results described above, we devised a dissociation-reassociation assay
in the absence of guanidium hydrochloride to explore the possible nature of
interactions involved in dimer formation (Figure 20). When the short and the long
forms of a homodimer are mixed with a dissociating agent (e.g., a peptide
containing the putative dimerization domain), the subunits of the dimer should
dissociate (in a concentration dependent fashion) due to the interaction of the agent
with the dimerization domain(s) of the protein. When a specific DNA probe is
subsequently added to the mixture to drive the formation of a stable protein-DNA
complex, any reassociated or remaining dimers can be detected. In
the presence of a low concentration of the dissociating agent, addition of DNA to
form the stable protein-DNA complex should lead to the detection of homodimers
as well as heterodimers. At a high concentration of the dissociating agent, subunits
of the dimer may not be able to re-form and no DNA-protein complexes would be
detected (Figure 20).
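
To make the expected band pattern concrete, the following sketch assumes that dissociated short and long subunits re-pair at random with equal affinity. That is an illustrative assumption, not a result of the experiments described here. Under it the three species appear in binomial proportions, so an equimolar mixture would give the heterodimer as the most abundant band.

```python
def dimer_fractions(short_fraction):
    """Expected dimer composition if subunits re-pair at random.

    short_fraction is the fraction of all subunits that are the short form
    (a hypothetical input; the text does not report subunit ratios).
    Returns (short homodimer, heterodimer, long homodimer) as fractions.
    """
    if not 0.0 <= short_fraction <= 1.0:
        raise ValueError("short_fraction must lie between 0 and 1")
    p = short_fraction
    q = 1.0 - p
    return p * p, 2.0 * p * q, q * q


for short_fraction in (0.25, 0.5, 0.75):
    short_homo, hetero, long_homo = dimer_fractions(short_fraction)
    print(
        f"short subunits {short_fraction:.2f}: "
        f"short/short {short_homo:.3f}, hetero {hetero:.3f}, long/long {long_homo:.3f}"
    )
```

Any departure from these proportions in a real gel would suggest unequal subunit amounts, incomplete exchange, or a pairing preference.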

The Stat91 sequence contains an SH2 domain (amino acids 569 to 700, see
discussion below), and we knew that Tyr-701 was the single phosphorylated
tyrosine residue required for DNA binding activity (supra, 45). Furthermore, we
have observed that phosphotyrosine at 10 mM, but not phosphoserine or
phosphothreonine, could prevent the formation of Stat91-DNA complex. We
therefore sought evidence that the dimerization of Stat91 involved specific
SH2-phosphotyrosine interaction using the dissociation and reassociation assay.

In order to evaluate the role of the SH2-phosphotyrosine interaction, two peptide
fragments of Stat91 corresponding to segments of the SH2 and phosphotyrosine
domains of Stat91 were prepared: a non-phosphorylated peptide (91Y),
LDGPKGTGYIKTELI (SEQ. ID NO: 18) (corresponding to amino acids 693-707),
and a phosphotyrosyl peptide (91Y-p), GY*IKTE (SEQ. ID
NO: 19) (representing residues 700-705).

Activated Stat84 and Stat91L were obtained from IFN-γ-treated C84 and C91L cells,
respectively, and mixed in the presence of various concentrations of the peptides,
followed by gel mobility shift analysis. The non-phosphorylated peptide had no effect
on the presence of the two gel shift bands characteristic of Stat84 or Stat91L homodimers
(Figure 21, lanes 2-4). In contrast, the phosphorylated peptide (91Y-p) at a
concentration of 4 µM clearly promoted the exchange between the subunits of
Stat84 dimers and Stat91L dimers to form heterodimers (Figure 21, lane 5). At a
higher concentration (160 µM), peptide 91Y-p, but not the unphosphorylated
peptide, dissociated the dimers and blocked the formation of DNA-protein
complexes (Figure 21, lane 7).

When cells are treated with IFN-α, both Stat91 (or 84) and Stat113 become
phosphorylated (15). Antiserum to Stat113 can precipitate both Stat113 and Stat91
after IFN-α treatment but not before, suggesting an IFN-α-dependent interaction of
these two proteins, perhaps as a heterodimer (15).

In Stat113, Tyr-690, in the position homologous to Tyr-701 in Stat91, is the single
target residue for phosphorylation. Amino acids downstream of the affected
tyrosine residue show some homology between the two proteins. We therefore
prepared a phosphotyrosyl peptide of Stat113 (113Y-p), KVNLQERRKY*LKHR
(SEQ. ID NO:20) [amino acids 681 to 694; (38)]. At concentrations similar to
91Y-p, 113Y-p also promoted the exchange of subunits between Stat84 and
Stat91L, while at a high concentration (40 µM), 113Y-p prevented the gel shift
bands almost completely (Figure 21, lanes 8-10).
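
The residue numbering quoted for the three STAT-derived peptides can be cross-checked with a short script. This is only a bookkeeping sketch: the sequences and residue ranges are those stated in the text (with the phosphate mark removed), while the helper names are introduced here for illustration.

```python
# name: (sequence without the phosphate mark, first residue, last residue)
PEPTIDES = {
    "91Y": ("LDGPKGTGYIKTELI", 693, 707),
    "91Y-p": ("GYIKTE", 700, 705),
    "113Y-p": ("KVNLQERRKYLKHR", 681, 694),
}


def span_is_consistent(sequence, first_residue, last_residue):
    """True if the peptide length matches the quoted residue range."""
    return len(sequence) == last_residue - first_residue + 1


def tyrosine_positions(sequence, first_residue):
    """Residue numbers of every tyrosine (Y) in the peptide."""
    return [first_residue + i for i, aa in enumerate(sequence) if aa == "Y"]


for name, (sequence, first, last) in PEPTIDES.items():
    print(
        name,
        "length matches range:", span_is_consistent(sequence, first, last),
        "tyrosine at:", tyrosine_positions(sequence, first),
    )
```

Each peptide has a length that matches its quoted range and contains a single tyrosine, at residue 701 for the two Stat91 peptides and at residue 690 for the Stat113 peptide, in agreement with the phosphorylation sites named in the text.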

We prepared a phosphotyrosyl peptide (SrcY-p), EPQY*EEIPIYL (SEQ. ID
NO:21), which is known to interact with the Src SH2 domain with high affinity
(50). This peptide showed no effect on Stat91 dimer formation (Figure 21,
lanes 11-13). Thus, it seems that Stat91 dimerization involves SH2 interaction with
tyrosine residues in a specific peptide sequence.



To test further the specificity of Stat91 dimerization mediated through specific
phosphotyrosyl peptide-SH2 interaction, a fusion product of glutathione-S-transferase
with the Stat91-SH2 domain (GST-91SH2) was prepared (Figure 22A)
and used in the in vitro dissociation-reassociation assay. At concentrations of 0.5
to 5 µM, the Stat91-SH2 domain promoted the formation of a heterodimer (Figure
22B, lanes 5-7). In contrast, neither GST alone, nor fusion products with a
mutant (R→L) Stat91-SH2 domain (GST-91mSH2) that renders Stat91 non-functional
in vivo, a Stat91 SH3 domain (GST-91SH3), or the Src SH2 domain
(GST-SrcSH2), induced the exchange of subunits between the Stat84 and Stat91L
homodimers (Figure 22B).

The initial sequence analysis of the Stat91 and Stat113 proteins revealed the
presence of SH2-like domains (see 13,38). Further, it was found that STAT
proteins themselves are phosphorylated on single tyrosine residues during their
activation (15,31). Single amino acid mutations either removing the Stat91
phosphorylation site, Tyr-701, or converting Arg-602 to Leu in the highly
conserved "pocket" region of the SH2 domain abolished the activity of Stat91 (45).
Thus it seemed highly likely that one possible role of the STAT SH2 domains
would be to bind the phosphotyrosine residues in one of the JAK kinases.

Since the activated STATs have phosphotyrosine residues and SH2 domains, a
second suggested role for SH2 domains was in protein-protein interactions within
the STAT family. By two physical criteria, electrophoresis in native gels and
sedimentation on gradients, Stat91 in untreated cells is a monomer and in treated
cells is a dimer (Figures 16-18). Since phosphotyrosyl peptides from Stat91 or
Stat113 and the SH2 domain of Stat91 could efficiently promote the formation of
heterodimers between Stat91L and Stat84 in a dissociation and reassociation
assay, we conclude that dimerization of Stat91 involves SH2-phosphotyrosyl
peptide interactions.



The possibility of an SH2 domain in Stat91 was indicated initially by the presence
of highly conserved amino acid stretches between the Stat91 and Stat113 sequences
in the 569 to 700 residue region, several of which, especially the FLLR sequence
at the amino terminal end of the region, are characteristic of SH2 domains. The
C-terminal halves of SH2 domains are less well conserved in general (39); this
was also true for the STAT proteins compared to other proteins, although Stat91
and Stat113 are quite similar in this region (38, 13, Figure 23). The available
structures of the lck, src, abl, and p85α SH2 domains permit identification of
structurally conserved regions (SCRs), and the detailed alignment of amino acid
sequences of several proteins (Figure 23) is based on these.

The characteristic W (in betaA1) is preceded by hydrophilic residues and is followed
by hydrophobic residues in Stat91, but alignment to the W seems justified, even if
the small beta sheet of which the W is part is shifted in Stat91. The three
positively charged residues contributing to the phosphotyrosyl binding site are at
the positions indicated as alphaA2, betaB5, and betaD5. Figure 23 shows an
alignment which accomplishes this by insertions in the 'AA' and 'CD' regions.
This is a different alignment from that previously suggested (38), and gives a
satisfactory alignment in the (beta)D region, although, like the previous alignment,
it is obviously considerably less similar to the other SH2 domains in the C-terminus.

This alignment suggests that the SH2 domain in Stat91 would end in the
vicinity of residue 700. In such an alignment, Tyr-701 occurs almost
immediately after the SH2 domain: a distance too short to allow an intramolecular
phosphotyrosine-SH2 interaction. Since the data presented earlier strongly
indicate that an SH2-phosphotyrosine interaction is involved in dimerization, such
an interaction is likely to be between two phospho-Stat91 subunits as a reciprocal
pTyr-SH2 interaction.

The apparent stability of the Stat91 dimer may be due to a high association rate
coupled with a high dissociation rate of SH2-phosphotyrosyl peptide interactions, as
suggested (Felder et al., 1993, Mol. Cell. Biol. 13:1449-1455), together with
interactions between other domains of Stat91 that may contribute stability to the
Stat91 dimer. Interference by homologous phosphopeptides with the
SH2-phosphotyrosine interaction would then lower stability sufficiently to allow
complete dissociation and heterodimerization.

The dimer formation between phospho-Stat91 subunits is the first case in eukaryotes
where dimer formation is regulated by phosphorylation, and the only one thus far
dependent on tyrosine phosphorylation. We anticipate that dimerization within the
STAT protein family will be important. It seems likely that in cells treated with
IFN-α, there is Stat113-Stat91 interaction (15). This may well be mediated
through SH2 and phosphotyrosyl peptide interactions as described above, leading
to a complex (a probable dimer of Stat91-Stat113) which joins with a 48 kD DNA
binding protein (a member of another family of DNA binding factors) to make a
complex capable of binding to a different DNA site. Furthermore, we have
recently cloned two mouse cDNAs which encode other STAT family members that
have conserved the same general structural features observed in the Stat91 and
Stat113 molecules (see Example 5, supra). (U.S. Application Serial No.
08/126,588, filed September 29, 1993, which is specifically incorporated herein by
reference in its entirety). Thus the specificity of STAT-containing complexes will
almost surely be affected by which proteins are phosphorylated and then available
for dimer formation.

The following is a list of references related to the above disclosure and particularly 
to the experimental procedures and discussions. The references are numbered to 
correspond to like number references that appear hereinabove.